http://open-source-security-software.net/project/netdata/releases.atom Recent releases for netdata 2024-05-17T10:51:05.638995+00:00 python-feedgen netdata v0.2 netdata v0.2 2015-09-25T22:01:16+00:00 2015-09-25T22:01:16+00:00 netdata v1.0rc netdata v1.0rc 2016-02-09T22:46:12+00:00 2016-02-09T22:46:12+00:00 netdata v1.0.0 netdata v1.0.0 2016-03-22T21:24:07+00:00 netdata 1.0.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.0.0 2016-03-22T21:24:07+00:00 netdata v1.1.0 netdata v1.1.0 2016-04-20T19:09:33+00:00 netdata 1.1.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.1.0 Dozens of commits that improve netdata in several ways: ### Data collection - added IPv6 monitoring - added SYNPROXY DDoS protection monitoring - apps.plugin: added charts for users and user groups - apps.plugin: grouping of processes now supports patterns - apps.plugin: now it is faster, after the new features added - better auto-detection of partitions for disk monitoring - better fireqos integration for QoS monitoring - squid monitoring now uses squidclient - SNMP monitoring now supports 64bit counters ### API - fixed issues in CSV output generation - netdata can now be restricted to listen on a specific IP (API and web server) ### Core - added error log flood protection ### Web Dashboard - better error handling when the netdata server is unreachable - each chart now has a toolbox - on-line help support - check for netdata updates button - added example /tv.html dashboard - now compiles with musl libc (alpine linux) ### Packaging - added debian packaging - support non-root installations - the installer generates uninstall script 2016-04-20T19:09:33+00:00 netdata v1.2.0 netdata v1.2.0 2016-05-16T20:18:06+00:00 ### Netdata demo sites: [http://my-netdata.io](http://my-netdata.io) ### At a glance 1. netdata now is **30% faster** ! 2.
netdata now has a **[registry](https://github.com/firehol/netdata/wiki/mynetdata-menu-item)** (**[my-netdata](https://github.com/firehol/netdata/wiki/mynetdata-menu-item)** dashboard menu) ! 3. netdata now monitors **Linux Containers** (cgroups, docker, lxc, etc) ! > IMPORTANT: > This version requires libuuid. The package you need to build netdata is: > - uuid-dev (debian/ubuntu), or > - libuuid-devel (centos/fedora/redhat) ### In detail #### netdata is now 30% faster ! - Patches submitted by @fredericopissarra improved overall netdata performance by 10%. - A new improved search function in the internal indexes made all searches faster by 50%, resulting in about 20% better performance for the core of netdata. - More efficient threads locking in key components contributed to the overall speed up. #### netdata now has a **[central registry](https://github.com/firehol/netdata/wiki/mynetdata-menu-item)** ! The central registry tracks all your netdata servers and bookmarks them for you at the **[my-netdata](https://github.com/firehol/netdata/wiki/mynetdata-menu-item)** menu on all dashboards. Every netdata can act as a registry, but there is also a global registry provided for free for all netdata users! #### netdata now monitors **Linux Containers** ! docker, lxc, or anything else. For each container it monitors CPU, RAM, DISK I/O (network interfaces were already monitored). 
#### Other improvements - apps.plugin: now uses linux capabilities by default without setuid to root - netdata has now an improved signal handler thanks to @simonnagl - API: new improved CORS support - SNMP: counter64 support fixed - MYSQL: more charts, about QCache, MyISAM key cache, InnoDB buffer pools, open files - DISK charts now show mount point when available - Dashboard: improved support for older web browsers and mobile web browsers (thanks to @simonnagl) - Multi-server dashboards now allow de-coupled refreshes for each chart, so that if one netdata has a network latency the other charts are not affected - Dozens of other improvements, optimizations and bug-fixes. netdata 1.2.0 - download release tarfiles also from http://firehol.org/download/netdata/releases/v1.2.0 2016-05-16T20:18:06+00:00 netdata v1.3.0 netdata v1.3.0 2016-08-27T21:48:47+00:00 ### New to netdata? Check its demo: [http://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) [![New Machines 
Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v41)](https://registry.my-netdata.io/#netdata_registry) ### At a glance 1. netdata has **[health monitoring / alarms](https://github.com/firehol/netdata/wiki/health-monitoring)**! 2. netdata **[generates badges](https://github.com/firehol/netdata/wiki/Generating-Badges)** that can be embedded anywhere! 3. netdata plugins are now written in python! 4. new plugins: redis, memcached, nginx_log, ipfs, apache_cache > IMPORTANT: > Since netdata now uses python plugins, new packages are > required to be installed on a system to allow it to work. > For more information, please check the **[installation page](https://github.com/firehol/netdata/wiki/Installation)**. ### In detail #### netdata has alarms! Based on the [POLL we made on github](https://github.com/firehol/netdata/issues/436), health monitoring was the winner. **So here it is!** netdata now has a **[powerful health monitoring](https://github.com/firehol/netdata/wiki/health-monitoring)** system embedded. ![image](https://cloud.githubusercontent.com/assets/2662304/18042169/f002baa8-6dc7-11e6-801d-8ec8e45453ae.png) #### netdata has badges! netdata can **[generate badges](https://github.com/firehol/netdata/wiki/Generating-Badges)** with live information from the collected metrics. #### netdata plugins are now written in python! Thanks to the great work of Paweł Krupa (@paulfantom), most BASH plugins have been ported to python.
The new python.d.plugin supports both **python2** and **python3** and data collection from multiple sources for all modules. The following pre-existing modules have been ported to python: - apache - cpufreq - example - exim - hddtemp - mysql - nginx - phpfpm - postfix - sensors - squid - tomcat The following new modules have been added: - apache_cache - dovecot - ipfs - memcached - nginx_log - redis #### other data collectors Thanks to @simonnagl netdata now reports disk space usage. #### other improvements - dashboards now transfer certain settings from server to server when changing servers via the my-netdata menu. The settings transferred are the dashboard theme, the online help status and current pan and zoom timeframe of the dashboard. - API improvements: - reduction functions now support 'min', 'sum' and 'incremental-sum'. - netdata now offers a multi-threaded and a single threaded web server (single threaded is better for IoT). - apps.plugin improvements: - can now run with command line argument 'without-files' to prevent it from enumerating all the open files/sockets/pipes of all running processes. - apps.plugin now scales the collected values to match the total system usage. - apps.plugin can now report guest CPU usage per process. - repeating errors are now logged once per process. - netdata now runs with IDLE process priority (lower than nice 19) - netdata now instructs the kernel to kill it first when it starves for memory. - netdata listens for signals: - SIGHUP to netdata instructs it to re-open its log files (new logrotate file added too). - SIGUSR1 to netdata saves the database - SIGUSR2 to netdata reloads health / alarms configuration - netdata can now bind to multiple IPs and ports. - netdata now has a new systemd service file (it starts as user netdata and does not fork).
- Dozens of other improvements and bugfixes netdata 1.3.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.3.0 2016-08-27T21:48:47+00:00 netdata v1.4.0 netdata v1.4.0 2016-10-03T23:02:23+00:00 ### New to netdata? Check its demo: [http://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > Release announced on 
[Hacker News](https://news.ycombinator.com/item?id=12633821) > Release announced on [reddit r/linux](https://www.reddit.com/r/linux/comments/55qmvn/netdata_the_opensource_realtime_performance_and/) > Release announced on [reddit r/sysadmin](https://www.reddit.com/r/sysadmin/comments/55qplu/just_released_netdata_v140/) > Release announced on [twitter](https://twitter.com/linuxnetdata/status/783211824711884800) ### At a glance - **the fastest netdata ever** (with a better look too)! - improved **IoT** and **containers** support! - **alarms** improved in almost **every** way! - new plugins: - softnet netdev, - extended TCP metrics, - UDPLite - NFS v2, v3 client (server was there already), - NFS v4 server & client, - APCUPSd, - RetroShare - improved plugins: - mysql, - cgroups, - hddtemp, - sensors, - phpfpm, - tc (QoS) ### In detail #### improved alarms! Many new alarms have been added to detect common kernel configuration errors and old alarms have been re-worked to avoid notification floods. Alarms now support: - **notification hysteresis** (both static and **dynamic**) ![image](https://cloud.githubusercontent.com/assets/2662304/19057563/6ace3096-89d9-11e6-98f8-4da9d62a575f.png) - notification **self-cancellation**, and - **dynamic thresholds** based on current alarm status ![image](https://cloud.githubusercontent.com/assets/2662304/19057546/43f8a366-89d9-11e6-9833-6d15da3faa8f.png) Also, a new alarms log: ![image](https://cloud.githubusercontent.com/assets/2662304/19057499/f2138fca-89d8-11e6-8513-4f717c0fb861.png) #### improved alarm notifications netdata now supports: - email notifications - **slack.com** notifications on slack channels - **pushover.net** notifications (mobile push notifications) - **telegram.org** notifications For all the above methods, netdata supports **role-based** notifications, with **multiple recipients** for each role and **severity filtering** per recipient!
Also, netdata support **HTML5 notifications**, while the dashboard is open in a browser window (no need to be the active one). ![image](https://cloud.githubusercontent.com/assets/2662304/18407279/82bac6a6-7714-11e6-847e-c2e84eeacbfb.png) All notifications (HTML5, emails, slack, pushover, telegram) are now **clickable** to get to the chart that raised the alarm. #### other improvements - improved **IoT** support! netdata builds and runs with musl libc and runs on systems based on busybox. - improved **containers** support! netdata runs on **alpine** linux (a low profile linux distribution used in containers). - Dozens of other improvements and bugfixes --- netdata 1.4.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.4.0 2016-10-03T23:02:23+00:00 netdata v1.5.0 netdata v1.5.0 2017-01-22T21:28:30+00:00 ### New to netdata? Check its demo: [http://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines 
Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > Release announced on [twitter](https://twitter.com/linuxnetdata/status/823844302698713088), [hacker news](https://news.ycombinator.com/item?id=13470333), [reddit r/linux](https://www.reddit.com/r/linux/comments/5pvgar/netdata_the_opensource_realtime_performance/), [reddit r/sysadmin](https://www.reddit.com/r/sysadmin/comments/5pvg5n/netdata_the_opensource_realtime_performance/), [reddit r/linuxadmin](https://www.reddit.com/r/linuxadmin/comments/5pvgfb/netdata_the_opensource_realtime_performance/), [reddit r/freebsd](https://www.reddit.com/r/freebsd/comments/5pvgk8/netdata_the_realtime_performance_monitoring/) Yet another release that makes netdata the fastest netdata ever! This is probably the release with the largest changeset so far. A lot of work, by a lot of people made this release possible! ## FreeBSD, MacOS and FreeNAS **[Vladimir Kobal](https://github.com/vlvkobal)** has done a magnificent work porting netdata to FreeBSD and MacOS. 
Everything works: - cpu and interrupts, memory, disks (performance and space monitoring) - network interfaces and softnet - IPv4 and IPv6 metrics - processes and context switches - IPC (queues, semaphores, shared memory) - and of course all the netdata external plugins **Wow!** Check it live on FreeBSD, at **[https://freebsd.my-netdata.io/](https://freebsd.my-netdata.io/)** ## Backends netdata supports **[data archiving to backend databases](https://github.com/firehol/netdata/wiki/netdata-backends)**: - Graphite - OpenTSDB - Prometheus and of course all the compatible ones (KairosDB, InfluxDB, Blueflood, etc) ![image](https://cloud.githubusercontent.com/assets/2662304/20649711/29f182ba-b4ce-11e6-97c8-ab2c0ab59833.png) With this feature netdata can interface with your existing devops infrastructure and allow you to visualize its metrics with other tools, like grafana. ## New Plugins **[Ilya Mashchenko](https://github.com/l2isbad)** has created most of the python data collection plugins in this release! He rocks! - **Systemd Services** (real-time monitoring of the resource utilization of all systemd services, using cgroups!) - **FPing** (network latency and jitter monitoring with netdata!) 
- **Postgres** databases @facetoe, @moumoul - **Varnish** disk cache (v3 and v4) @l2isbad - **ElasticSearch** @l2isbad - **HAproxy** @l2isbad - **FreeRadius** @l2isbad, @lgz - **mdstat** (RAID) @l2isbad - **ISC bind** (via rndc) @l2isbad - **ISC dhcpd** @l2isbad, @lgz - **Fail2Ban** @l2isbad - **OpenVPN** status log @l2isbad, @lgz - **NUMA** memory @tycho - **CPU Idle States** @tycho - **gunicorn** @deltaskelta - **ECC memory hardware errors** - **IPC** semaphores - **uptime** ( with a nice badge too: [![uptime badge](https://registry.my-netdata.io/api/v1/badge.svg?chart=system.uptime&v42)](https://registry.my-netdata.io/) ) ## Improved Plugins - **netfilter conntrack** - **MySQL/MariaDB** (replication) @l2isbad - **ipfs** @pjz - **cpufreq** @tycho - **hddtemp** @l2isbad - **sensors** @l2isbad - **nginx** @leolovenet - **nginx_log** @paulfantom - **phpfpm** @leolovenet - **redis** @leolovenet - **dovecot** @justohall - **cgroups** - **disk space** - **apps.plugin** - **/proc/interrupts** @rlefevre - **/proc/softirqs** @rlefevre - **/proc/vmstat** (system memory charts) - **/proc/net/snmp6** (IPv6 charts) - **/proc/self/meminfo** (system memory charts) - **/proc/net/dev** (network interfaces) - **tc** (linux QoS) ## New and Improved Alarms - **MySQL/MariaDB** alarms (incl. replication) - **IPFS** alarms - **HAproxy** alarms - **UDP buffer** alarms - **TCP AttemptFails** - **ECC memory alarms** - **netfilter connections** alarms ## New Alarm Notification Methods - **messagebird.com** @tech-no-logical - **pagerduty.com** @jimcooley - **pushbullet.com** @tperalta82 - **twilio.com** @shadycuz - **HipChat** - **kafka** ## Shell Integration Shell scripts can now query netdata easily! ``` sh eval "$(curl -s 'http://localhost:19999/api/v1/allmetrics')" ``` After this command, all the netdata metrics are exposed to the shell.
Check: ``` sh # source the metrics eval "$(curl -s 'http://localhost:19999/api/v1/allmetrics')" # let's see if there are variables exposed by netdata for system.cpu set | grep "^NETDATA_SYSTEM_CPU" NETDATA_SYSTEM_CPU_GUEST=0 NETDATA_SYSTEM_CPU_GUEST_NICE=0 NETDATA_SYSTEM_CPU_IDLE=95 NETDATA_SYSTEM_CPU_IOWAIT=0 NETDATA_SYSTEM_CPU_IRQ=0 NETDATA_SYSTEM_CPU_NICE=0 NETDATA_SYSTEM_CPU_SOFTIRQ=0 NETDATA_SYSTEM_CPU_STEAL=0 NETDATA_SYSTEM_CPU_SYSTEM=1 NETDATA_SYSTEM_CPU_USER=4 NETDATA_SYSTEM_CPU_VISIBLETOTAL=5 # let's see the total cpu utilization of the system echo ${NETDATA_SYSTEM_CPU_VISIBLETOTAL} 5 # what about alarms? set | grep "^NETDATA_ALARM_SYSTEM_SWAP_" NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_STATUS=CRITICAL NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_VALUE=53 NETDATA_ALARM_SYSTEM_SWAP_USED_SWAP_STATUS=CLEAR NETDATA_ALARM_SYSTEM_SWAP_USED_SWAP_VALUE=51 # let's get the current status of the alarm 'ram in swap' echo ${NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_STATUS} CRITICAL # is it fast? time curl -s 'http://localhost:19999/api/v1/allmetrics' >/dev/null real 0m0,070s user 0m0,000s sys 0m0,007s # it is... # 0.07 seconds for curl to be loaded, connect to netdata and fetch the response back... ``` The `_VISIBLETOTAL` variable sums up all the dimensions of each chart. The format of the variables is: `NETDATA_${chart_id^^}_${dimension_id^^}="${value}"` The `value` is rounded to the closest integer, since shell script cannot process decimal numbers. 
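The variable naming above can be wrapped in two small helpers, so that a script asks for a chart and a dimension instead of spelling out the variable name. This is a minimal sketch, not part of netdata: `netdata_var_name` and `netdata_value` are hypothetical names, and the rule that every non-alphanumeric character becomes `_` is an assumption inferred from the sample output (`system.cpu` appears as `NETDATA_SYSTEM_CPU`).

```sh
#!/bin/sh
# Hypothetical helpers, not shipped with netdata.

# Build the variable name for a chart id and a dimension id.
# Assumption: ids are upper-cased and every character outside A-Z and 0-9
# is replaced by "_", as the sample output for system.cpu suggests.
netdata_var_name() {
    nd_chart=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9' '_')
    nd_dim=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9' '_')
    printf 'NETDATA_%s_%s\n' "$nd_chart" "$nd_dim"
}

# Print the value of that variable, or an empty line when it is not set.
# The metrics must already have been sourced into the current shell.
netdata_value() {
    nd_name=$(netdata_var_name "$1" "$2")
    eval "printf '%s\n' \"\${$nd_name:-}\""
}

# Usage, after sourcing the metrics with the eval/curl command shown above:
#   cpu=$(netdata_value system.cpu visibletotal)
#   [ "${cpu:-0}" -gt 80 ] && echo "cpu utilization is high: ${cpu}%"
```

Because the values are rounded to integers, plain `[ ... -gt ... ]` comparisons are enough for threshold checks like the one in the usage comment.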
## Dashboard Improvements - dashboard is now faster on firefox, safari, opera, edge (edge is still the slowest) - dashboard charts legends now have bigger fonts - SHIFT + mousewheel to zoom charts, works on all browsers - perfect-scrollbar on the dashboard - dashboard 4K resolution fixes - dashboard compatibility fixes for embedding charts in third party web sites - charts on custom dashboards can have common min/max even if they come from different netdata servers - alarm log is now saved and loaded back so that the alarm history is available at the dashboard ## Other Improvements - python.d.plugin has received way too many improvements from many contributors! - charts.d.plugin can now be forked to support multiple independent instances - registry has been re-factored to lower its memory requirements (required for the public registry) - simple patterns in cgroups, disks and alarms - netdata-installer.sh can now correctly install netdata in containers - supplied logrotate script compatibility fixes - spec cleanup @breed808 - clocks and timers reworked @rlefevre netdata has received a lot more improvements from many more contributors! (it was really a lot of work to dig into git log to collect all the above, so forgive me if I forgot to mention a few contributions and contributors). Thank you all! 2017-01-22T21:28:30+00:00 netdata v1.6.0 netdata v1.6.0 2017-03-20T18:36:43+00:00 ### New to netdata?
Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > Release announced on [twitter](https://twitter.com/linuxnetdata/status/843895489431044097), [hacker news](https://news.ycombinator.com/item?id=13916596), [reddit r/linux](https://www.reddit.com/r/linux/comments/60idga/netdata_the_opensource_realtime_performance/), 
[reddit r/sysadmin](https://www.reddit.com/r/sysadmin/comments/60ieet/netdata_the_opensource_realtime_performance/), [reddit r/linuxadmin](https://www.reddit.com/r/linuxadmin/comments/60iebz/netdata_the_opensource_realtime_performance/), [reddit r/freebsd](https://www.reddit.com/r/freebsd/comments/60ie8j/netdata_the_opensource_realtime_performance/), [reddit r/devops](https://www.reddit.com/r/devops/comments/60mrwv/netdata_the_opensource_realtime_performance_and/), [reddit r/homelab](https://www.reddit.com/r/homelab/comments/60jc00/netdata_the_opensource_realtime_performance/), [facebook](https://www.facebook.com/linuxnetdata/posts/1721842394773573:0) ## birthday release: **1 year netdata** netdata was first published on **March 30th, 2016**. It has been a crazy year since then: <p align="center"> <b>225.000</b> unique netdata users<br/> <i>currently, at 1.000 new unique users per day</i> <br/>&nbsp;<br/> <b>80.000</b> unique netdata installations<br/> <i>currently, at 500 new unique installations per day</i> <br/>&nbsp;<br/> <b>610.000</b> docker pulls on docker hub<br/> <br/> <b>4.000.000</b> netdata sessions served<br/> <i>currently, at 15.000 unique netdata sessions served per day</i> <br/>&nbsp;<br/> <b>20.000</b> github stars<br/> <br/> Thank you!<br/> You are <b>awesome</b>!<br/> </p> ## Central netdata is here! This is the first release that supports real-time streaming of metrics between netdata servers.
netdata can now be: - **autonomous host monitoring** (like it always has been) - **headless data collector** (collect and stream metrics in real-time to another netdata) - **headless proxy** (collect metrics from multiple netdata and stream them to another netdata) - **store and forward proxy** (like **headless proxy**, but with a local database) - **central database** (metrics from multiple hosts are aggregated) metrics databases can be configured on all nodes and each node maintaining a database may have a **different retention** policy and possibly run (even different) **alarms** on them. <p align="center"> <img src="https://cloud.githubusercontent.com/assets/2662304/23629551/bb1fd9c2-02c0-11e7-90f5-cab5a3ed4c53.png"/> </p> There are 4 settings that control what netdata can be: 1. `[global].memory mode` in `netdata.conf`, controls if a netdata will maintain a **local database** and the type of it. For more information check [Running a dedicated central netdata server](https://github.com/firehol/netdata/wiki/Memory-Requirements#running-a-dedicated-central-netdata-server). 2. `[web].mode` in `netdata.conf`, controls if netdata will **expose its API**, and the type of web server to enable (single or multi-threaded). Check [netdata.conf configuration for streaming](https://github.com/firehol/netdata/wiki/Replication-Overview#netdataconf-configuration). 3. `[stream].enabled` in `stream.conf`, controls if netdata will **stream its metrics to another netdata**. Check [stream.conf for sending metrics](https://github.com/firehol/netdata/wiki/Replication-Overview#streaming-configuration). 4. `[API KEY].enabled` in `stream.conf`, controls if netdata will **accept metrics from other netdata**. Check [stream.conf for receiving metrics](https://github.com/firehol/netdata/wiki/Replication-Overview#options-for-the-receiving-node). 
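As an illustration of how the four settings combine, the fragment below sketches a headless collector that streams to a central netdata. Treat it as an assumption-laden sketch rather than a reference configuration: only `memory mode`, `mode` and the two `enabled` options are named in these notes, while the `destination` and `api key` options, the host name `central.example.com` and the key value are placeholders added for illustration.

```
# netdata.conf on the headless collector: no local database, no API
[global]
    memory mode = none

[web]
    mode = none

# stream.conf on the headless collector: send metrics to the central netdata
# (destination, api key and their values are placeholders, not from these notes)
[stream]
    enabled = yes
    destination = central.example.com:19999
    api key = 11111111-2222-3333-4444-555555555555

# stream.conf on the central netdata: accept metrics sent with that key
[11111111-2222-3333-4444-555555555555]
    enabled = yes
```

Leaving `memory mode` and `[web].mode` at a value other than `none` on the sending side would turn the same node into a proxy with a local database.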
Using the above, we support a lot of different configurations, like these:

target | memory<br/>mode | web<br/>mode | stream<br/>enabled | send to<br/>backend | local<br/>alarms | local<br/>dashboard
-------|:-----------:|:---:|:------:|:-------:|:---------:|:----:
**headless collector**|`none`|`none`|`yes`|not possible|not possible|no
**headless proxy**|`none`|not `none`|`yes`|not possible|not possible|no
**proxy with db**|not `none`|not `none`|`yes`|possible|possible|yes
**central netdata**|not `none`|not `none`|`no`|possible|possible|yes

### monitoring ephemeral nodes netdata now supports monitoring autoscaled ephemeral nodes, that are started and stopped on demand (their IP is not known). <p align="center"> <img src="https://cloud.githubusercontent.com/assets/2662304/23627295/e3569adc-02b8-11e7-9d55-4014bf98c1b3.png"/> </p> When the ephemeral nodes start streaming metrics to the central netdata, the central netdata will register them at the `my-netdata` menu on the dashboard, like this: <p align="center"> <img src="https://cloud.githubusercontent.com/assets/2662304/24080824/24cd2d3c-0caf-11e7-909d-a8dd1dbb95d7.png"/> </p> You can see this live at [https://build.my-netdata.io](https://build.my-netdata.io) (this server may not always be available for demo). For more information check: [monitoring ephemeral nodes](https://github.com/firehol/netdata/wiki/monitoring-ephemeral-nodes). ### monitoring ephemeral containers and VM guests netdata now cleans up container, guest VM, network interfaces and mounted disk metrics, automatically disabling their alarms too. For more information check [monitoring ephemeral containers](https://github.com/firehol/netdata/wiki/monitoring-ephemeral-containers). ## apps.plugin ported for FreeBSD **[Vladimir Kobal](https://github.com/vlvkobal)** has ported `apps.plugin` to FreeBSD.
netdata can now provide `Applications`, `Users` and `User Groups` under FreeBSD too. Also, the CPU utilization of netdata under FreeBSD is now a lot lower than in netdata v1.5. See it live at our **[FreeBSD demo server](http://freebsd.my-netdata.io/#menu_apps)**. ## web_log plugin [Ilya Mashchenko](https://github.com/l2isbad) has done a wonderful job creating a **unified web log parsing plugin** for all kinds of web server logs. With it, netdata provides real-time performance information and health monitoring alarms for web applications and web sites! Requests by http status: ![image](https://cloud.githubusercontent.com/assets/2662304/22902194/ea0affc6-f23c-11e6-85f1-a4951dd4bb40.png) Requests by http status code family: ![image](https://cloud.githubusercontent.com/assets/2662304/22901883/dea7d33a-f23b-11e6-960d-00a913b58936.png) Requests by http status code: ![image](https://cloud.githubusercontent.com/assets/2662304/22901965/1a5d84ba-f23c-11e6-9d38-3deebcc8b879.png) Requests bandwidth: ![image](https://cloud.githubusercontent.com/assets/2662304/22902266/245141d6-f23d-11e6-90f9-98729733e0da.png) Requests timings: ![image](https://cloud.githubusercontent.com/assets/2662304/22902283/369e3f92-f23d-11e6-9359-53e5d4ecb18e.png) URL patterns of interest (you configure the patterns): ![image](https://cloud.githubusercontent.com/assets/2662304/22902302/4d25bf06-f23d-11e6-844d-18c0876bdc3d.png) Requests by http method: ![image](https://cloud.githubusercontent.com/assets/2662304/22902323/5ee376d4-f23d-11e6-8457-157d3f438843.png) Requests by IP version: ![image](https://cloud.githubusercontent.com/assets/2662304/22902370/7091a770-f23d-11e6-8cd2-74e9a67b1397.png) Number of unique clients: ![image](https://cloud.githubusercontent.com/assets/2662304/22902384/835aa168-f23d-11e6-914f-cfc3f06eaff8.png) and a lot more, including **alarms**:

alarm|description|minimum<br/>requests|warning|critical
:-------|-------|:------:|:-----:|:------:
`1m_redirects`|The ratio of HTTP redirects (3xx except 304) over all the requests, during the last minute.<br/>&nbsp;<br/>*Detects if the site or the web API is suffering from too many or circular redirects.*<br/>&nbsp;<br/>(i.e. **oops!** *this should not redirect clients to itself*)|120/min|&gt; 20%|&gt; 30%
`1m_bad_requests`|The ratio of HTTP bad requests (4xx) over all the requests, during the last minute.<br/>&nbsp;<br/>*Detects if the site or the web API is receiving too many bad requests, including `404`, not found.*<br/>&nbsp;<br/>(i.e. **oops!** *a few files were not uploaded*)|120/min|&gt; 30%|&gt; 50%
`1m_internal_errors`|The ratio of HTTP internal server errors (5xx), over all the requests, during the last minute.<br/>&nbsp;<br/>*Detects if the site is facing difficulties to serve requests.*<br/>&nbsp;<br/>(i.e. **oops!** *this release crashes too much*)|120/min|&gt; 2%|&gt; 5%
`5m_requests_ratio`|The percentage of successful web requests of the last 5 minutes, compared with the previous 5 minutes.<br/>&nbsp;<br/>*Detects if the site or the web API is suddenly getting too many or too few requests.*<br/>&nbsp;<br/>(i.e. too many = **oops!** *we are under attack*)<br/>(i.e. too few = **oops!** *call the network guys*)|120/5min|&gt; double or &lt; half|&gt; 4x or &lt; 1/4x
`web_slow`|The average time to respond to requests, over the last 1 minute, compared to the average of last 10 minutes.<br/>&nbsp;<br/>*Detects if the site or the web API is suddenly a lot slower.*<br/>&nbsp;<br/>(i.e. **oops!** *the database is slow again*)|120/min|&gt; 2x|&gt; 4x
`1m_successful`|The ratio of successful HTTP responses (1xx, 2xx, 304) over all the requests, during the last minute.<br/>&nbsp;<br/>*Detects if the site or the web API is performing within limits.*<br/>&nbsp;<br/>(i.e. **oops!** *help us God!*)|120/min|&lt; 85%|&lt; 75%

For more information check: **[the spectacles of a web server log file](https://github.com/firehol/netdata/wiki/The-spectacles-of-a-web-server-log-file)**.
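To make the thresholds concrete, here is a rough sketch of how a ratio alarm such as `1m_redirects` could be evaluated by hand. It is a simplification and not how netdata implements its health checks: `classify_ratio` is a hypothetical helper, the `UNDEFINED` and `WARNING` status names are assumptions (only `CLEAR` and `CRITICAL` appear in these notes), and no hysteresis is applied.

```sh
#!/bin/sh
# Hypothetical helper, not part of netdata.
# Usage: classify_ratio COUNT TOTAL MIN_REQUESTS WARN_PCT CRIT_PCT
classify_ratio() {
    cr_count=$1 cr_total=$2 cr_min=$3 cr_warn=$4 cr_crit=$5

    # Below the minimum number of requests the ratio is not evaluated.
    if [ "$cr_total" -lt "$cr_min" ]; then
        echo "UNDEFINED"
        return 0
    fi

    # Integer-only comparison: count/total > pct/100  <=>  count*100 > pct*total
    if [ $((cr_count * 100)) -gt $((cr_crit * cr_total)) ]; then
        echo "CRITICAL"
    elif [ $((cr_count * 100)) -gt $((cr_warn * cr_total)) ]; then
        echo "WARNING"
    else
        echo "CLEAR"
    fi
}

# Example with the 1m_redirects limits (120 requests/min, > 20%, > 30%):
#   classify_ratio 50 200 120 20 30
```

Multiplying both sides of the comparison keeps the check in integer arithmetic, which matters because plain shell arithmetic cannot handle decimals.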
## backends netdata can now archive metrics to `JSON` backends (both push, by @lfdominguez, and pull modes). ## IPMI monitoring netdata now has an IPMI plugin (based on [freeipmi](https://www.gnu.org/software/freeipmi/)) for monitoring **server hardware**. The plugin creates (up to) 8 charts, based on the information collected from IPMI: 1. number of sensors by state 2. number of events in SEL 3. Temperatures CELSIUS 4. Temperatures FAHRENHEIT 5. Voltages 6. Currents 7. Power 8. Fans It also supports alarms (including the number of sensors in **critical** state): ![image](https://cloud.githubusercontent.com/assets/2662304/23674138/88926a20-037d-11e7-89c0-20e74ee10cd1.png) For more information, check [monitoring IPMI](https://github.com/firehol/netdata/wiki/monitoring-IPMI). ## New Plugins **[Ilya Mashchenko](https://github.com/l2isbad)** builds python data collection plugins for netdata at a wonderful rate! **He rocks!** - **web_log** for monitoring in real-time all kinds of web server log files @l2isbad - **freeipmi** for monitoring IPMI (server hardware) - **nsd** (the [name server daemon](https://www.nlnetlabs.nl/projects/nsd/)) @383c57 - **mongodb** @l2isbad - **smartd_log** (monitoring disk S.M.A.R.T. values) @l2isbad ## Improved Plugins - **nfacct** reworked and now collects connection tracker information using netlink. 
- **ElasticSearch** re-worked @l2isbad - **mysql** re-worked to allow faster development of custom mysql based plugins (MySQLService) @l2isbad - **SNMP** - **tomcat** @NMcCloud - **ap** (monitoring hostapd access points) - **php_fpm** @l2isbad - **postgres** @l2isbad - **isc_dhcpd** @l2isbad - **bind_rndc** @l2isbad - **numa** - **apps.plugin** improvements and freebsd support @vlvkobal - **fail2ban** @l2isbad - **freeradius** @l2isbad - **nut** (monitoring UPSes) - **tc** (Linux QoS) now works on qdiscs instead of classes for the same result (a lot faster) @t-h-e - **varnish** @l2isbad ## New and Improved Alarms - **web_log**, many alarms to detect common web site/API issues - **fping**, alarms to detect packet loss, disconnects and unusually high latency - **cpu**, cpu utilization alarm now ignores `nice` ## New and improved alarm notification methods - **HipChat** to allow hosted HipChat @frei-style - **discordapp** @lowfive ## Dashboard Improvements - dashboard now works on HiDPi screens - dashboard now shows version of netdata - dashboard now resets charts properly - dashboard updated to use latest gauge.js release ## Other Improvements - thanks to @rlefevre netdata now uses a lot of different high resolution system clocks. netdata has received a lot more improvements from many more contributors! (it was really a lot of work to dig into git log to collect all the above, so forgive me if I forgot to mention a few contributions and contributors). Thank you all! 2017-03-20T18:36:43+00:00 netdata v1.7.0 netdata v1.7.0 2017-07-16T20:12:51+00:00 ### New to netdata? 
Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) --- This is release v1.7 of netdata. netdata is still spreading fast: we are at **320.000 users** and **132.000 servers**! Almost 100k new users, 52k new installations and 800k docker pulls since the previous release 4 and a half months ago! 
netdata user base grows at about **1000 new users and 600 new servers _per day_**! Thank you! You are awesome! > The next release (v1.8) will be focused on providing a **global health monitoring service**, for all netdata users, **for free**! Read more about it [here](https://github.com/firehol/netdata/issues/2466). We need supporters for this cause. Join us! ### highlights of netdata v1.7 1. netdata is now a (very fast) fully featured **statsd** server and the only one with automatic visualization: push a statsd metric and hit F5 on the netdata dashboard: your metric visualized. It also supports synthetic charts, defined by you, so that you can correlate and visualize your application the way you like it. 2. netdata got **new installation options** - it is now [easier than ever to install netdata](https://github.com/firehol/netdata/wiki/Installation#linux-one-liner) - we also distribute a **[statically linked netdata x86_64 binary](https://github.com/firehol/netdata/wiki/Installation#x86_64-pre-built-binary-for-any-linux)**, including key dependencies (like `bash`, `curl`, etc) that can run everywhere a Linux kernel runs (CoreOS, CirrOS, etc). 3. **metrics streaming and replication** has been improved significantly. All known issues have been solved and key enhancements have been added. Headless collectors and proxies can now send metrics to backends when `data source = as collected`. 4. **backends** have got quite a few enhancements, including **host tags**, **metrics filtering** at the netdata side and sending of chart and dimension names instead of IDs; **[prometheus](https://github.com/firehol/netdata/wiki/Using-Netdata-with-Prometheus)** support has been re-written to utilize more prometheus features and provide more flexibility and integration options. **IF YOU UPDATE FROM NETDATA 1.6 PLEASE CHECK YOUR DASHBOARDS, SINCE MANY METRICS HAVE CHANGED NAMES**. 5. 
netdata now monitors **ZFS** (on Linux and FreeBSD), **ElasticSearch**, **RabbitMQ**, **Go** applications (via `expvar`), **ipfw** (on FreeBSD 11), **samba**, **squid logs** (with `web_log` plugin!). 6. netdata **dashboard loading times** have been improved significantly (hit F5 a few times on a netdata dashboard - it is now amazingly fast), to support dashboards with thousands of charts. 7. netdata alarms now support [custom hooks](https://github.com/firehol/netdata/blob/e980060b2d0724a6ea220a7a1005e2fd094db3ec/conf.d/health_alarm_notify.conf#L318-L365), so you can run whatever you like in parallel with netdata alarms. 8. As usual, this release brings dozens more improvements, enhancements and compatibility fixes. ## netdata is now a fully featured **statsd** server netdata is now a fully featured **statsd** server. It can collect statsd formatted metrics, visualize them on its dashboards, stream them to other netdata servers or archive them to backend time-series databases. netdata statsd is fast. It can collect more than 1.200.000 metrics per second on modern hardware, more than 200Mbps of sustained statsd traffic. netdata statsd is inside netdata. This provides a distributed statsd implementation. netdata also supports statsd **synthetic charts**: You can create dedicated sections on the dashboard to render the charts. You can control everything: the main menu, the submenus, the charts, the dimensions on each chart, etc. [Read more about netdata statsd](https://github.com/firehol/netdata/wiki/statsd) #### counters - Scope: **count the events of something** (e.g. number of file downloads) - Format: `name:INTEGER|c` or `name:INTEGER|C` or `name|c` - statsd increments the counter by the `INTEGER` number supplied (positive, or negative). ![image](https://cloud.githubusercontent.com/assets/2662304/26131553/4a26d19c-3aa3-11e7-94e8-c53b5ed6ebc3.png) #### gauges - Scope: **report the value of something** (e.g. 
cache memory used by the application server) - Format: `name:FLOAT|g` - statsd remembers the last value supplied, and can increment or decrement the latest value if `FLOAT` begins with `+` or `-`. ![image](https://cloud.githubusercontent.com/assets/2662304/26131575/5d54e6f0-3aa3-11e7-9099-bc4440cd4592.png) #### histograms - Scope: **statistics on the size of events** (e.g. statistics on the sizes of files downloaded) - Format: `name:FLOAT|h` - statsd maintains a list of all the values supplied and provides statistics on them. ![image](https://cloud.githubusercontent.com/assets/2662304/26131587/704de72a-3aa3-11e7-9ea9-0d2bb778c150.png) The same chart with `sum` unselected, to show the detail of the dimensions supported: ![image](https://cloud.githubusercontent.com/assets/2662304/26131598/8076443a-3aa3-11e7-9ffa-ea535aee9c9f.png) #### meters This is identical to `counter`. - Scope: **count the events of something** (e.g. number of file downloads) - Format: `name:INTEGER|m` or `name|m` or just `name` - statsd increments the counter by the `INTEGER` number supplied (positive, or negative). ![image](https://cloud.githubusercontent.com/assets/2662304/26131605/8fdf5a06-3aa3-11e7-963f-7ecf207d1dbc.png) #### sets - Scope: **count the unique occurrences of something** (e.g. unique filenames downloaded, or unique users that downloaded files) - Format: `name:TEXT|s` - statsd maintains a unique index of all values supplied, and reports the unique entries in it. ![image](https://cloud.githubusercontent.com/assets/2662304/26131612/9eaa7b1a-3aa3-11e7-903b-d881e9a35be2.png) #### timers - Scope: **statistics on the duration of events** (e.g. statistics for the duration of file downloads) - Format: `name:FLOAT|ms` - statsd maintains a list of all the values supplied and provides statistics on them. 
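The metric formats listed above can be tried out with a few lines of Python. This is a minimal sketch, not an official client: the helper names `statsd_line` and `send_metrics` and the `myapp.*` metric names are made up for illustration, and it assumes the statsd server listens for UDP on `localhost` port `8125` (the conventional statsd port); adjust both to match your setup.

```python
import socket


def statsd_line(name, value, metric_type):
    """Build one statsd datagram in the `name:value|type` form."""
    return f"{name}:{value}|{metric_type}"


def send_metrics(lines, host="localhost", port=8125):
    """Send each line as its own UDP datagram (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("utf-8"), (host, port))


# One example per metric type described above (hypothetical metric names).
examples = [
    statsd_line("myapp.downloads", 1, "c"),           # counter
    statsd_line("myapp.cache_mb", 512.5, "g"),        # gauge, absolute value
    statsd_line("myapp.cache_mb", "+10", "g"),        # gauge, relative change
    statsd_line("myapp.file_size", 2048, "h"),        # histogram
    statsd_line("myapp.requests", 1, "m"),            # meter
    statsd_line("myapp.users", "alice", "s"),         # set
    statsd_line("myapp.download_time", 320.5, "ms"),  # timer
]
```

Calling `send_metrics(examples)` pushes the seven datagrams. UDP is fire-and-forget, so the call gives no confirmation that a server received them.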
![image](https://cloud.githubusercontent.com/assets/2662304/26131620/acbea6a4-3aa3-11e7-8bdd-4a8996847767.png) The same chart with the `sum` unselected: ![image](https://cloud.githubusercontent.com/assets/2662304/26131629/bc34f2d2-3aa3-11e7-8a07-f2fc94ba4352.png) --- ## dashboard improvements There have been significant optimizations to the loading times of the dashboard. The dashboard loads instantly now, even when there are several hundreds of charts in it (hit F5 on the dashboard - it is super fast). For those who know: we eliminated most browser reflows, by refactoring the way the charts are initialized and splitting initialization into 2 phases. Unfortunately we had to re-shape gauge and easypiecharts, so pay some attention to your custom dashboards after updating. We now use **natural sorting** on the dashboard elements (i.e. instead of 1, 10, 2, 3 we get 1, 2, 3, 10). There have been dozens of performance improvements on the netdata dashboard. Like all the previous releases, **this release makes netdata the fastest netdata** so far! ## new installation methods - Single line installation on Linux - Static 64bit packages for Linux - Improved support for Red Hat Enterprise Linux @racciari - Improved support for Amazon Machine Image - Improved support for CentOS @n0coast - Many more installer/updater improvements @nielsAD, @mfurlend ## Streaming - improved self cleanup of obsolete charts and hosts at a central netdata. - **host tags** are now propagated from netdata to netdata while streaming metrics. - log error when multiple clients are streaming the metrics of the same host. - dozens more streaming improvements and bugfixes. ## Backends - New **[prometheus](https://github.com/firehol/netdata/wiki/Using-Netdata-with-Prometheus)** backend, supporting all the features of the other backends netdata supports. The new format changed the names of metrics, so if you use grafana or other tools you will have to update your queries. 
- Prometheus and opentsdb now support **host tags** (advanced ephemeral nodes monitoring) - Metrics sent to backends with data source `average`, `sum` or `volume` (from the netdata database) are now more accurate. - Added `contrib/nc-backend.sh`, a script that can act as a fallback backend for graphite, opentsdb and compatibles. - netdata nodes without a database (slaves and proxies) can now send `as collected` metrics to backends. ## New and improved plugins - **Go** apps monitoring via `expvar` ! @kralewitz - **ElasticSearch** monitoring ! @l2isbad - **RabbitMQ** monitoring ! @l2isbad - **ipfw** monitoring under FreeBSD 11 ! @vlvkobal - **ZFS** monitoring under FreeBSD (@vlvkobal) and Linux ! - **samba** monitoring ! @ntlug - `web_log` plugin can now monitor **squid logs** too ! @l2isbad - `web_log` plugin can now monitor **apache cache logs** too (removed old `apache_cache` plugin) @l2isbad - many more `web_log` improvements - `web_log` is now a lot more powerful! @l2isbad - `python.d.plugin` `LogService` now supports monitoring web log files matching a pattern @l2isbad - disk monitoring under Linux now utilizes `/dev/mapper` names. It also has improved docker compatibility. 
- `haproxy` improvements @l2isbad - `dns_query_time` plugin to monitor the response time of nameservers @l2isbad - Fronius Solar @BrainDoctor - better support for monitoring Proxmox/qemu @efaden and libvirt/qemu VMs - `cpufreq` improvements @l2isbad - `smartd_log` improvements @pkoenig10 - `bind_rndc` rewritten @l2isbad - `lighttpd` improvements (part of the `apache` plugin) - `isc_dhcpd` improvements @l2isbad - `fping` improvements - `apps.plugin` improvements (added many more applications to monitor, notably **hadoop** and friends, improved compatibility) - `freeipmi` improvements - `mdstat` improvements @l2isbad - `mysql` improvements @alibo - `redis` improvements @l2isbad - `postgres` rds fixes @facetoe - `fail2ban` improvements @l2isbad - `idlejitter` rewritten - `openvpn` improvements @l2isbad - `numa` improvements @Benje06 ## New and improved alarms - `alarm-notify.sh` now supports [custom notification methods](https://github.com/firehol/netdata/blob/e980060b2d0724a6ea220a7a1005e2fd094db3ec/conf.d/health_alarm_notify.conf#L318-L365) (you can hook whatever you like to netdata alarms). - email notifications are now multipart (have both HTML and text versions in them) - low memory alarm now excludes ZFS ARC. - improved discord notifications. - improved telegraf notifications @alibo - `lighttpd` alarm - `mongodb` alarm @jnogol ## Other improvements - memory mode `ram` utilizes KSM (kernel memory deduper). - many memory mode `map` improvements for faster operation with huge databases. - netdata is now even faster on FreeBSD, thanks to several optimizations made by @vlvkobal - netdata can now be compiled with `clang`, even on FreeBSD - netdata can now be compiled on FreeBSD 10.3 2017-07-16T20:12:51+00:00 netdata v1.8.0 netdata v1.8.0 2017-09-17T17:07:02+00:00 ### New to netdata? 
Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) --- netdata v1.8.0 released. This release focuses on metrics streaming improvements and containers monitoring. As always, this netdata is the fastest and most stable netdata ever! 
**Update now!** [To install or update netdata, click here](https://github.com/firehol/netdata/wiki/Installation)! ## key streaming improvements #### bug fix: streaming slaves consuming 100% CPU netdata, as a slave, was not handling all the error cases properly, resulting in 100% cpu utilization of a single core, under certain conditions. On FreeBSD and macOS slaves in particular, these conditions were always met, so using FreeBSD or macOS as netdata slaves was completely broken. #### bug fix: missing alarm notifications on netdata masters netdata was incorrectly mixing up cached alarm state data between the alarms of the mirrored hosts, so alarm notifications were not dispatched under certain conditions. This was affecting only netdata masters (i.e. netdata servers with more than one host database, with health monitoring enabled). The alarms were generated and were visible at the dashboards, but the notifications were not always sent. #### bug fix: streamed charts with duplicate names There was a minor issue with charts that were created with name aliases. When these charts were streamed from netdata slaves to netdata masters, they ended up with duplicate chart names (i.e. instead of `type.name` they had `type.type.name`). --- ## key containers monitoring improvements - **Container network interfaces are now moved to the container section** and they are rendered from the container view point (i.e. `sent` = what the container sent) - no more `veth*` garbage on the dashboard. - The interfaces also appear as `eth0` (or whatever the container sees) and they are inside the container section of the dashboard. netdata maps each `veth*` interface to the right container, using plain `cgroups` features, so this works for all container managers (docker, lxc, etc). - Eliminated the nested containers shown under certain versions of `lxc`. 
- Also, containers and VMs now have summary gauges on the dashboard ![image](https://user-images.githubusercontent.com/2662304/30249219-3e032246-9640-11e7-9f37-4d78f280a74d.png) --- ## key plugins improvements #### python.d.plugin now supports HTTP keep-alive netdata now uses `urllib3` (shipped with netdata for both python v2 and v3) for URLService based plugins. This enables HTTP `keep-alive` on all connections, which allows netdata to have permanent connections to third party web applications. Fixed by @l2isbad --- ## compatibility enhancements - better support for Oracle Linux, by @schindlerd - better support for Alpine Linux - various fixes at the build procedure for macOS - `fping` can now run as non-root, in static binary netdata packages ## netdata generic enhancements - netdata can now listen on UNIX domain sockets (`.sock` files). This allows a local web server and netdata to communicate bypassing the network stack (for netdata set `bind to = unix:/path/to/netdata.sock` - this option supports multiple arguments, so netdata can listen to multiple unix sockets and tcp sockets, at the same time). - netdata was assuming that the JSON representation of a chart would at most be 1024 bytes, and it was generating **corrupted JSON** output when any chart was exceeding that limit. Removed the limitation (ie. now there is no limit). - netdata was crashing while starting, if **no usable disks were found**. - systemd `netdata.service` now allows setting negative netdata OOM score and restarts netdata if it crashes. The new `netdata.service` is not automatically installed when updating netdata. Either delete `/etc/systemd/system/netdata.service` and then update/re-install netdata, or copy the file by hand. 
- minor fixes to the installer, by @vincele --- ## new plugins - Added Intel CPU temperature charts on FreeBSD and macOS, by @vlvkobal - Added CPU thermal throttling charts on Linux (useful on physical servers and possibly laptops) - Added `chrony` plugin, by @domschl - Added [Stiebel Eltron](https://github.com/firehol/netdata/blob/master/node.d/README.md#stiebel-eltron) plugin to collect metrics from heat pumps and hot water installations from Stiebel Eltron ISG @BrainDoctor ## improved plugins - `web_log` bugfixes, enhancements and optimizations (including `squid` logs), by @l2isbad - `web_log` now enables parsing HTTP/2 logs in `custom_log_format`, by @Funzinator - `redis` bugfixes, by @l2isbad - `haproxy` bugfixes, by @l2isbad - `elasticsearch` bugfixes and optimizations, by @l2isbad - `rabbitmq` bugfixes and optimizations, by @l2isbad - `mdstat` bugfixes, by @JeffHenson - `tomcat` improvements, by @Wing924 - `mysql` improvements, by @alibo and @l2isbad - `dovecot` improvements - `postgres` improvements, by @facetoe - `cpufreq` fixed a bug that prevented `accurate` reporting of CPU frequencies. `accurate` works with the `acpi-cpufreq` driver and calculates the average CPU clock of the CPUs utilizing the accounting per frequency, as reported by the kernel, by @tycho - `cpuidle` performance improvements (faster under load) by @tycho - `fail2ban` bugfixes, by @l2isbad - `SNMP` plugin now uses the latest `net-snmp`, and the corrupted 64 bit counters encountered under certain node.js versions are now fixed. --- ## dashboard improvements - `easypiecharts` and `gauges` can now render arbitrary ranges and animate clockwise or counter-clockwise. - traditionally netdata was using 1024 bits = 1 kilobit. It is fixed: 1000 bits = 1 kilobit. - netdata charts should now work on wordpress pages. 
--- ## alarms and notifications - `alarm-notify.sh` now supports debug mode, showing the exact commands it runs to send notifications, when `export NETDATA_ALARM_NOTIFY_DEBUG=1` - `alarm-notify.sh` now supports setting the sender email address of the emails it sends. - emails sent by `alarm-notify.sh` now include headers to reduce the possibility of them being scored as spam, by @Ferroin - network related alarms got new thresholds and improved badges - netdata now detects if the system has been suspended and pauses all alarms for 60 seconds on resume, to prevent false alarms (no more false alarms on laptops when they resume). - netdata alarms now support filtering based on hostname and O/S (linux, freebsd, macos). This means that netdata masters can now support alarms for slaves of any O/S (i.e. a Linux netdata master can handle alarms for a FreeBSD slave). - netdata slack notifications now show the host that sent the alarm. In the image below, the alarm is about `bangalore`, and is sent by `netdata-build-server` (at the lower left corner): ![image](https://user-images.githubusercontent.com/2662304/30249624-55a62bf2-9648-11e7-8e85-730395a8e0bf.png) --- ## statsd - the number of fractional points supported by statsd is now configurable (1 to 7). - 95th percentile calculation on statsd histograms and timers was incorrectly averaging the values. It is now fixed. - statsd metrics with non ASCII text were processed by the statsd server, but were breaking JSON data generated by netdata. Fixed it by replacing all invalid characters. 2017-09-17T17:07:02+00:00 netdata v1.9.0 netdata v1.9.0 2017-12-16T23:22:02+00:00 ### New to netdata? 
Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) --- # Overview of netdata v1.9 1. **snapshots** We can now save and load dashboard snapshots for any timeframe in any resolution. snapshots allow us to save artifacts, evidence, documentation of incidents, or just the raw data for postmortem analysis. 2. 
**highlighted time-frame** We can now highlight a selected time-frame on all dashboard charts. So, to quickly compare charts press ALT or CONTROL and select an area on one chart. The same area will be highlighted on all charts. 3. **export to PDF** We can now export netdata dashboards to PDF, for any timeframe with any detail. 4. **access lists** (IP filtering) We can now set up IP filtering at `netdata.conf` for all functions of netdata (dashboard access, streaming, registry, badges, etc - no more iptables rules for protecting netdata). 5. **TCP overflows and connection drops** netdata can now detect TCP listening sockets overflows and connection drops, for any server running on the host (even the ones netdata is not aware of). 6. **libvirt VMs** netdata now detects **libvirt** network interfaces and moves them to the VM section of the dashboard (it also supports `.libvirt-qemu` naming of cgroups). 7. **Units auto-scaling** netdata dashboards can now **scale units** (`KB` -> `MB` -> `GB` -> `TB`, etc), on the fly. 8. **Units conversions** netdata dashboards can now **convert units** (eg. Celsius to Fahrenheit, seconds to HH:MM:DD, etc), on the fly. 9. **Multiple Timezones** netdata dashboards can now **change timezone** on the fly (yes, we can now compare charts with server logs). 10. **python.d.plugin** rewritten @l2isbad rewrote the whole of it, to add flexibility and support the latest netdata features! The new plugin supports the old python modules. 11. **better / faster dashboard scrolling** netdata now uses passive event listeners to detect page scrolling. This significantly improved the responsiveness of the dashboard (check your dashboard settings: `sync` scrolling is the fastest, `async` is closer to the older behavior). 12. netdata now monitors **couchdb**, **powerdns**, **beanstalkd** and **dnsdist** ! 13. netdata now detects **redis** background save failures 14. netdata can now send **flock.com** and **kavenegar.com** alarm notifications and as always... 
dozens more improvements, enhancements, new features and bug fixes! --- ## netdata dashboard snapshots ! Netdata can now export and import dashboard **snapshots**. Snapshots are JSON files containing everything the dashboard needs to be rendered: charts and chart data. They are exported as JSON files, to your computer. The saved snapshots can be loaded back on any netdata dashboard (even of a different host). When importing, no network traffic is generated. The web browser loads the local file and renders an interactive dashboard to examine it. The current visible timeframe of the dashboard is respected, so first align the dashboard to the timeframe required and then click "Export". The pop-up allows selecting the resolution of the export (its detail). ![peek 2017-11-13 13-13](https://user-images.githubusercontent.com/2662304/32723039-a89d6d62-c874-11e7-9735-3f576b2b215c.gif) --- ## highlighted time-frame ! Press the ALT or CONTROL key and select a time-frame at a chart. An overlay will appear with the selected time-frame and all the charts will highlight the same region. The highlighted time-frame: 1. Is added to the URL hash, so that reloading the page keeps it 2. Is propagated to other netdata servers, via the `my-netdata` menu 3. Is saved in dashboard snapshots (and of course restored when they are loaded back) ![peek 2017-11-19 19-39](https://user-images.githubusercontent.com/2662304/32993483-78f8f148-cd61-11e7-9dbb-dab082ec231a.gif) Also, netdata charts can now be zoomed vertically (use the SHIFT key, like in zoom, but select the chart vertically): ![peek 2017-11-19 20-10](https://user-images.githubusercontent.com/2662304/32993790-bf47b8d8-cd65-11e7-8f4a-c92d48e8a454.gif) --- ## netdata dashboards to PDF ! netdata dashboards can now be printed to PDF. Just click the :printer: icon on the dashboard. The current visible timeframe of the dashboard is respected, so first align the dashboard to the timeframe required and then click "Print". 
![peek 2017-11-11 19-55](https://user-images.githubusercontent.com/2662304/32692083-8db522de-c71a-11e7-8c17-7d731cdf0945.gif) --- ## netdata now supports API access lists (IP filtering) netdata can now check the client IPs connecting to it and deny/allow access based on your settings. No more iptables rules to control access to netdata. All these settings are [netdata simple patterns](https://github.com/firehol/netdata/wiki/Configuration#netdata-simple-patterns) that are checked against the client IP (string matching - not subnet matching). localhost clients (IPv4, IPv6 and unix domain sockets) can be matched with `localhost`: #### Global access control - `[web].allow connections from` to match the clients' IPs allowed to connect to netdata. This has the same effect as iptables (but implemented at the application level - so clients will get connected, and disconnected immediately if they are not allowed access, without any response from netdata). #### Dashboard access control - `netdata.conf`: `[web].allow dashboard from` to match the clients' IPs that are allowed to access the dashboard (i.e. fetch static files and query netdata API). - `netdata.conf`: `[web].allow badges from` to match the clients' IPs that are allowed to access badges (the dashboard clients are allowed to access badges too, so this setting allows badges to clients that do not have access to the dashboard). #### Streaming access control - `netdata.conf`: `[web].allow streaming from` to match the clients' IPs that are allowed to stream metrics. - `stream.conf`: `[API_KEY].allow from` to match the clients' IPs allowed to push metrics for the given API KEY. - `stream.conf`: `[MACHINE_GUID].allow from` to match the clients' IPs allowed to push metrics for the specific machine. netdata will also check the API keys supplied by slaves and proxies connected. 
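Because these access lists are matched as strings rather than subnets, it helps to see what that means in practice. The sketch below is a rough Python approximation, not netdata's implementation: the function name `simple_pattern_allows` is hypothetical, and it assumes the documented simple-pattern behavior (space-separated patterns, `*` as a wildcard, a leading `!` for a negative match, first matching pattern wins). The special `localhost` keyword is not handled here.

```python
import re


def simple_pattern_allows(patterns, client_ip):
    """Approximate a simple-pattern check against a client IP string.

    Hypothetical helper: patterns are space separated, `*` matches any run
    of characters, a leading `!` denies, and the first match decides.
    """
    for pattern in patterns.split():
        negative = pattern.startswith("!")
        if negative:
            pattern = pattern[1:]
        # Escape everything except `*`, which becomes "any characters".
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, client_ip):
            return not negative
    # Nothing matched: deny.
    return False


allowed = simple_pattern_allows("!10.0.0.5 10.* 192.168.1.*", "10.1.2.3")
```

Note the consequence of string matching: a pattern such as `10.1*` matches `10.1.2.3` but also `10.100.0.1`, so patterns should normally end at a dot (`10.1.*`) when a whole network is meant.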
#### Other access lists - `netdata.conf`: `[web].allow netdata.conf from` to limit the clients that can get `netdata.conf` - by default netdata allows only private IPs. - `netdata.conf`: `[registry].allow from` to limit the clients allowed to access the registry (only when this netdata acts as a registry). --- ## netdata detects TCP listening sockets overflowing or dropping connections Added a new chart: `ipv4.tcplistenissues` with dimensions `ListenOverflows` and `ListenDrops`. > This chart detects if any listening TCP socket on the host, is overflown, or it drops connections. This is system-wide: any listening TCP socket, of any application. The chart will not be shown if these kernel counters are zero. It will be enabled automatically if it is found non-zero at any point (it is collected via `/proc/net/netstat` every second). If you need to enable it even if it is zero, edit netdata.conf and set: ``` [plugin:proc:/proc/net/netstat] TCP listen issues = yes ``` Two alarms have been added, one for `ListenOverflows` and one for `ListenDrops` that detect if there is any overflow or drop in the last minute (they run every 10 seconds). slack alarm for overflows: ![image](https://user-images.githubusercontent.com/2662304/31415368-b547d524-ae2b-11e7-895a-8b2f87b8ec81.png) slack alarm for drops: ![image](https://user-images.githubusercontent.com/2662304/31415446-3980c13e-ae2c-11e7-9635-f1edfff7cc50.png) and the alarms configuration: ![screenshot from 2017-10-09 23-04-05](https://user-images.githubusercontent.com/2662304/31356299-3e42ef12-ad46-11e7-9bf5-007ba57b3755.png) The alarms will automatically be attached when the chart is active. The overflows dimension and alarm is supported on FreeBSD too. ## `/proc/net/sockstat` and `/proc/net/sockstat6` These files provide sockets statistics for all protocols. 
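As a rough illustration of what these files contain, here is a minimal Python sketch that parses text in the `/proc/net/sockstat` layout (one `PROTOCOL: name value name value ...` line per protocol). It is not netdata's collector; the sample text and the function name `parse_sockstat` are made up for the example.

```python
# Illustrative sample in the /proc/net/sockstat layout (values are made up).
SAMPLE = """sockets: used 285
TCP: inuse 8 orphan 0 tw 0 alloc 13 mem 2
UDP: inuse 4 mem 2
"""


def parse_sockstat(text: str) -> dict:
    """Parse 'PROTO: name value name value ...' lines into nested dicts."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        protocol, _, rest = line.partition(":")
        fields = rest.split()
        # Fields alternate between a counter name and its integer value.
        stats[protocol.strip()] = {
            name: int(value) for name, value in zip(fields[0::2], fields[1::2])
        }
    return stats


if __name__ == "__main__":
    for protocol, counters in parse_sockstat(SAMPLE).items():
        print(protocol, counters)
```

On a Linux host the same function could be fed the contents of `/proc/net/sockstat` instead of the sample string.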
![screenshot from 2017-11-07 02-39-37](https://user-images.githubusercontent.com/2662304/32471232-ebdb905a-c364-11e7-9d2f-a472de516315.png) netdata also adds 3 new alarms: 1. too many tcp orphan sockets 2. tcp memory that detects that the tcp stack is under memory pressure or close to giving memory errors 3. too many tcp connections (for kernels that do not support dynamic allocation of connections) --- ## Streaming - netdata proxies with more than 100 slaves had a timing issue that caused them to crash randomly on slave reconnects. Parts of the code have been rewritten to get rid of the timing issue. - netdata slaves and proxies now have a protection that ensures they will never use 100% CPU, even if the master is misbehaving. - expired orphaned hosts are now removed from the `my-netdata` menu of the dashboard. - streaming functions can now be monitored via `access.log` - streaming now supports **IP filtering**. So the entire streaming functionality, API keys and MACHINE GUIDs can be associated with one or more IPs or IP patterns. - streaming now transfers alarm variables too --- ## python.d.plugin rewritten @l2isbad did a marvelous job rewriting `python.d.plugin`. The new plugin: 1. supports option `autodetection_retry: SECONDS`. When set to non-zero, the plugin will re-check the module every that many seconds. This solves the problem that netdata would give up collecting metrics from an application that was not running when netdata started. By default it is zero for all modules, so you need to enable it for every application that needs it. 2. got a rewrite of several functions, like logging, module configuration, chart and dimensions management. 3. the new URL service disables certificate checks by default, to allow self-signed certificates to work without configuration. The new plugin is compatible with custom python modules developed for the previous version.
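The `autodetection_retry` behaviour described above can be pictured with a small Python sketch. This is only an illustration of the retry idea, not the plugin's code: the function `autodetect`, the `check` callback, and the `max_attempts` cap (added so the example always terminates) are all hypothetical.

```python
import time


def autodetect(check, autodetection_retry, max_attempts, sleep=time.sleep):
    """Retry `check()` until it succeeds.

    `autodetection_retry` is the number of seconds to wait between attempts;
    zero means "try once and give up". Returns the number of attempts it took,
    or None if the module was never detected.
    """
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        if autodetection_retry <= 0 or attempts >= max_attempts:
            return None
        sleep(autodetection_retry)


if __name__ == "__main__":
    # Simulate an application that only becomes available on the third check.
    results = iter([False, False, True])
    print(autodetect(lambda: next(results), 1, max_attempts=5))
```

The `sleep` parameter exists only so the waiting can be replaced in tests; a real collector would integrate the retry with its own scheduling.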
--- ## web_log plugin - custom regex now supports parsing hostnames and IPs @l2isbad - web_log now parses lines with error 408 (request timeout - these are a special case, since the request has not been received by the web server, so the log line is incomplete) @l2isbad - now properly parses `resp_length` with value `-` @racciari --- ## couchdb monitoring CouchDB maintainer @wohali submitted a couchdb plugin for netdata. The plugin monitors: - database activity - http response codes - server operations - per DB statistics ![mwsnap 2017-09-29 22_54_33](https://user-images.githubusercontent.com/112292/31041874-7463c37a-a56a-11e7-9d10-3ce969b983ea.png) ![mwsnap 2017-09-29 22_54_44](https://user-images.githubusercontent.com/112292/31041875-74651f40-a56a-11e7-9181-5771d1de9fef.png) --- ## redis monitoring 2 charts have been added to monitor background save health status, bundled with 2 alarms that detect if background save has failed, or background save is slow (warn > 10 min, crit > 20 min). @l2isbad ![screenshot_20170925_092235](https://user-images.githubusercontent.com/22274335/30788234-ba213432-a1d3-11e7-8f94-a9452ed94e3d.png) --- ## Other new and enhanced plugins - netdata now monitors [PowerDNS](https://www.powerdns.com/), @l2isbad - netdata now monitors [beanstalkd](http://kr.github.io/beanstalkd/), @l2isbad - netdata now monitors [dnsdist](https://dnsdist.org/), @nobody-nobody - disks under Linux are renamed using `/dev/disk/by-label`. An option has been added at netdata.conf to also allow renaming based on `/dev/disk/by-id`. - `chrony` is now disabled by default, because there have been reports that `chronyc` enters an infinite loop in CentOS and RHEL. - `tomcat` improvements to support flavors of the tomcat server @Wing924 - `zfs` on FreeBSD now monitors ZFS TRIM statistics - disks monitoring charts on FreeBSD got a lot more FreeBSD related dimensions. - added CPU frequency charts on FreeBSD (Linux already had them).
- chart `system.io` (the total system Disk I/O) is now calculated by aggregating the reads and writes of all physical disks. The previous `system.io` chart (that is based on `pgpgin` and `pgpgout` from `/proc/vmstat`) is now named `system.pgpgio`. The key difference is that the new `system.io` now sees ZFS I/O, and it also correctly and accurately sums the real disk bandwidth of RAID arrays. - chart `system.net` (the total system network bandwidth) is now calculated by aggregating the bandwidth of all physical network interfaces and is common for both IPv4 and IPv6. - `tc` (QoS) charts now sort the dimensions on the legends, the same way `tc` reports them. - `postgres`: the WAL directory was named `pg_xlog` in versions before 10 and has been renamed to `pg_wal` from version 10 upwards @facetoe - `mysql` (and mariadb) got new charts for galera replication @spinitron - `openvpn_log` improvements @l2isbad - `smartd` improvements @l2isbad - `varnish` module has been rewritten @l2isbad - `mdstat` regex fix @l2isbad - `smartd_log` improvements @l2isbad - `dns_query_time` improvements @wungad - `isc_dhcpd` improvements @wungad - `freeipmi.plugin` got a command line option (can be given at netdata.conf) to ignore certain sensor IDs that are faulty. - `freeradius` improvements @wungad - `node.d.plugin` bugfixes ## Plugins protocol enhancements - netdata now supports multiple plugin directories. The setting is the same in `netdata.conf`, `plugins directory = "DIRECTORY1" "DIRECTORY2" ...`, up to 20 directories. By default netdata sets: ``` [global] plugins directory = "/usr/libexec/netdata/plugins.d" "/etc/netdata/custom-plugins.d" ``` - netdata now supports **alarms variables**. Each plugin can now define **host global** and **chart local** variables with static values, that can be used in alarms' expressions. So, hosts and charts can now have any number of static values associated with them (eg.
an application server may expose its max connections limit), and these static values can be used to trigger alarms (eg. the current connections are compared to the max connections variable). The whole setup allows alarm templates to use this feature (eg each netdata can maintain different such variables for each server it monitors). Alarm variables are propagated to upstream netdata servers. --- ## O/S - distro support - added init file for SLC 6.9 and CloudLinux Server release 6.9 - packages installer was incorrectly detecting all python versions as version 2. - a `makeself` bug that prevented the static netdata binaries from being installed on `busybox` systems has been fixed. - openrc startup script (gentoo, alpine) had hardcoded the path to netdata. This affected all static-64bit builds when installed on these distros. Fixed. - the static 64bit installer now downloads netdata.conf, much like the git installer does. - openrc / gentoo init improvements @candrews - enabled support for macOS versions 10.5+ (10.11 was working already) @vlvkobal - enabled support for FreeBSD 12 @vlvkobal - fixed a crash on macOS hosts with empty disk names. - added `Dockerfile.armv7hf` for running netdata under docker on ARM v7 machines @justin8 --- ## Dashboard improvements - hover selection of charts is now faster on all browsers. Perfect on Chrome, Firefox and Opera. Quite usable on Edge. - the dashboard is now fixed when a modal is open, preventing scrolling the page. - the dashboard now uses fontawesome 5.0.1 for icons. - the chart names can now be searched with browser control-F (find in page). Because netdata lazy loads all charts, it was previously impossible to search for a chart. Now the charts are searchable. This is important on dashboards with several hundreds of statsd charts, because all these charts appear under the same section. - netdata now detects **libvirt** VM network interfaces and moves them to the VM section of the dashboard.
The same functionality already exists for containers. ![screenshot from 2017-10-31 01-32-43](https://user-images.githubusercontent.com/2662304/32200665-7c2b2d1c-bddb-11e7-98a8-c027578b2b0a.png) - Show the context of each chart. The `context` is used in alarm templates. (hover on the date of the chart) ![image](https://user-images.githubusercontent.com/2662304/31103168-529d9a1a-a7de-11e7-8257-5f08d93083ba.png) - Show the resolution of the chart. (hover on the time of the chart) ![image](https://user-images.githubusercontent.com/2662304/31103223-7deaeaec-a7de-11e7-8470-f62191ec9303.png) - The dashboard now adds a tooltip at the date of the charts, to show the plugin and its module that collects each chart. - The dashboard should now put a lot less CPU pressure on the browser when the page does not have focus. #### automatic units scaling The dashboard does dynamic units scaling, **on the fly** ! It converts: - network bandwidth (`kilobits/s` to `megabits/s` or `gigabits/s`) - input/output bandwidth (`kilobytes/s` to `megabytes/s` or `gigabytes/s`, similarly for `KB/s`) - memory sizes (`MB` to `KB`, `GB` or `TB`) - disk sizes (`GB` to `MB` or `TB`) Chart units dynamically adapt based on the value of the selected dimension too: ![peek 2017-10-06 22-58](https://user-images.githubusercontent.com/2662304/31296227-309a4c06-aaea-11e7-8c7b-3bca6a954372.gif) Custom dashboards can give `data-desired-units="UNITS"` and netdata will automatically convert the presented values to the desired units. `UNITS` can be any of the supported ones, or `auto` for auto-scaling based on the values, or `original` to show the original units maintained by the netdata server. #### units conversions The dashboard now supports units conversions.
Currently it converts: temperatures from `Celsius` to `Fahrenheit` ![image](https://user-images.githubusercontent.com/2662304/31411328-ef7558a0-ae19-11e7-9905-dbb95b261d12.png) `seconds` to human readable duration `DDd:HH:MM:SS` ![image](https://user-images.githubusercontent.com/2662304/31411264-c1e68454-ae19-11e7-9870-2bd633443545.png) #### timezone conversions netdata can now convert all dates presented to any timezone. Traditionally netdata presented all charts at the timezone of the viewer. This allowed homogeneous central administration of systems that are installed all over the world. However, this was inefficient when we needed to compare the information presented on the dashboard, with the log files of the servers. So, now netdata can present the charts on any timezone. The netdata server auto-detects the timezone of the server and new dashboard settings have been added to allow this conversion. If autodetection of the servers timezone fails, the configuration option `[global].timezone` has been added in `netdata.conf` to set it. Also, the dashboard itself allows the viewers to configure the timezone (it is saved at browser local storage, so this has to be set just once per viewer). #### new dashboard options To support all the above, the dashboard settings got a new tab, with all the required options: ![screenshot from 2017-10-10 23-54-01](https://user-images.githubusercontent.com/2662304/31409982-587d96f4-ae16-11e7-8cab-7b66d6b0eb6e.png) --- ## statsd improvements - statsd metrics can now be added to statsd synthetic charts using patterns. No need to add a `dimension` line for each statsd metric to be added. netdata will also extract the wildcarded part of the metric name and use that one for the dimension name. - dimensions added to statsd synthetic charts, can automatically be renamed using a dictionary. 
Each synthetic charts application has its own dictionary of name - value pairs, which is used to automatically rename statsd metrics when they are added to synthetic charts. - statsd timers and histograms now report zeros when nothing is collected --- ## Badges improvements - fixed a bug in netdata badges that was incorrectly matching zero values with the `null` color condition. - added API option `display_absolute` to allow badges to use the signed value for color evaluation, but present the absolute value. --- ## Other Alarm and Alarm Notifications Improvements - warning emails sent by netdata are now a little bit more orange (they were a bit greenish). - added flock.com notifications @tvarsis - added kavenegar.com support for SMS notifications @vahit - fixed a bug in email notifications that was triggering a corrupted MIME match by anti-spam solutions. - pushbullet notifications now track the devices, so that per device filtering at pushbullet is possible. Also improved the formatting a bit. @user501254 - pushover notifications fixes (the priority of warnings was set incorrectly) - alarms can now use variables like this `${variable with spaces or +, -, *, / in it}`. So, alarms can now use dimension names with any character in them. --- ## Other Improvements - `access.log` has been refactored to support monitoring all netdata operations - inodes monitoring is now disabled by default for mount points based on filesystems that do not have a maximum inode threshold (such as `cephfs`). - `rabbitmq` has been added to `apps_groups.conf` so that `apps.plugin` now monitors rabbitmq instances (cpu, memory, disk I/O, sockets, etc).
- several email and log management apps have been added to `email` and `logs` targets of `apps_groups.conf`, @Flums - `ceph` target added to `apps_groups.conf` to allow netdata to monitor [Ceph](//ceph.com/) - the unified, distributed storage system, @k0ste - refactored several internal data collection plugins to eliminate a few hundred index lookups per second. - `netdata.conf` settings that are loaded from disk, but were the same as the default ones, were generated commented when the server was asked to give its config. Now all loaded settings are generated uncommented. - netdata simple patterns can now extract the wildcarded part of the string they match (used in statsd synthetic charts) - netdata simple patterns can allow escaping spaces by prefixing them with a backslash. 2017-12-16T23:22:02+00:00 netdata v1.10.0 netdata v1.10.0 2018-03-27T20:52:12+00:00 ### New to netdata? Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users
Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) --- Posted on [twitter](https://twitter.com/linuxnetdata/status/978741630005141504), [facebook](https://www.facebook.com/linuxnetdata/), [reddit r/linux](https://www.reddit.com/r/linux/comments/87mhau/netdata_the_opensource_realtime_performance_and/), --- Hi all, Another great netdata release: **netdata v1.10.0** ! This is a birthday release: **netdata is now 2 years old** ! Many thanks to all the contributors that help building, enhancing and improving a project useful and helpful for thousands of admins, devops and developers around the world! You rock! \- @ktsaou ## At a glance netdata now has a new web server (called `static`) with a fixed number of threads, providing a lot better performance and finer control of the resources allocated to it. All dashboard elements (javascript) have been updated to their latest versions - this allows a smoother experience when embedding netdata charts on third party web sites and apps. --- > IMPORTANT: **all users using older netdata are advised to update to this version**. 
This version offers improved stability, security and a huge number of bug fixes, compared to any prior version of netdata. --- #### new plugins - **BTRFS** - monitor the allocations of BTRFS filesystems (yes, netdata can now properly detect when btrfs is running out of space) - **BCACHE** - monitor the caching block layer that allows building hybrid disks using normal HDDs and SSDs - **Ceph** - monitor ceph distributed storage - **nginx plus** - monitor the nginx+ web servers - **libreswan** - monitor IPSEC tunnels - **Traefik** - monitor traefik reverse proxies - **icecast** - monitor icecast streaming servers - **ntpd** - monitor NTP servers - **httpcheck** - monitor any remote web server - **portcheck** - monitor any remote TCP port - **spring-boot** - monitor java spring boot applications - **dnsdist** - monitor dnsdist name servers - **hugepages** - monitor the allocation of Linux hugepages #### enhanced / improved plugins - statsd - web_log - containers monitoring - system memory - diskspace - network interfaces - postgres - rabbitmq - apps.plugin - haproxy - uptime - ksm - mdstat - elasticsearch - apcupsd - isc-dhcpd - fronius - stiebeleltron #### new alarm notifications methods - alerta - IRC And as always, hundreds more enhancements, improvements and bugfixes. --- ## BTRFS monitoring **BTRFS** space usage monitoring and related alarms. netdata is able to detect if any of the space-related components (physical disk allocation, data, metadata and system) of BTRFS is about to become exhausted! [#3150](https://github.com/firehol/netdata/pull/3150) - thanks to @Ferroin for explaining everything about btrfs... ![screenshot from 2017-12-19 01-15-38](https://user-images.githubusercontent.com/2662304/34132777-28bb0b52-e45a-11e7-806d-f9b0e791c4ec.png) ## bcache monitoring netdata now monitors bcache metrics - they are automatically added to any disk that is found to be a bcache disk.
## ceph monitoring New plugin to monitor [ceph](https://ceph.com/), the unified, distributed storage system designed for excellent performance, reliability and scalability ([#3166](https://github.com/firehol/netdata/pull/3166) @lets00). ## containers and VMs monitoring - netdata now monitors `systemd-nspawn` containers. - netdata now renames charts of kubernetes containers. - `virsh` is now called with `-r` to avoid prompting for password [#3144](https://github.com/firehol/netdata/issues/3144) - `cgroup-network` is now a lot more strict, preventing unauthorized privilege escalation [#3269](https://github.com/firehol/netdata/issues/3269) - `cgroup-network` now searches for container processes in sub-cgroups too - this improves the mapping of network interfaces to containers - `cgroup-network` now works even when there are no `veth` interfaces in the system ## monitor ntpd netdata can now monitor isc-ntpd. @rda0 did a marvelous job decoding NTP Control Message Protocol, collecting ntpd metrics in the most efficient way [#3421](https://github.com/firehol/netdata/pull/3421), [#3454](https://github.com/firehol/netdata/pull/3454) @rda0 ![ntpd_system](https://user-images.githubusercontent.com/2662304/36447005-bbdfcd06-168b-11e8-8a28-0f5c118a04ea.png) > btw, netdata also monitors `chrony` but the chrony module of netdata is disabled by default, because certain CentOS versions ship a version of chrony that consumes 100% cpu when queried for statistics. ## nginx plus web servers monitoring Added python plugin to monitor the operation of nginx plus servers. 
The plugin monitors everything about nginx+, except streaming [#3312](https://github.com/firehol/netdata/pull/3312) @l2isbad ## libreswan IPSEC tunnels monitoring netdata now monitors libreswan tunnels - [#3204](https://github.com/firehol/netdata/pull/3204) ![screenshot from 2018-01-03 00-32-14](https://user-images.githubusercontent.com/2662304/34502742-95a0b928-f01d-11e7-83bb-ae8fc8182a1a.png) ## remote HTTP/HTTPS server monitoring netdata now has an `httpcheck` plugin (module of python.d.plugin), that can query remote http/https servers, track the response timings and check that the response body contains certain text [#3448](https://github.com/firehol/netdata/pull/3448) @ccremer. ![httpcheck](https://user-images.githubusercontent.com/12159026/36447709-77c2f400-1685-11e8-941a-07ac7c7f10d8.png) ## remote TCP port monitoring netdata now has a `portcheck` plugin (module of python.d.plugin), that can check whether any remote TCP port is open [#3447](https://github.com/firehol/netdata/pull/3447) @ccremer ![portcheck](https://user-images.githubusercontent.com/12159026/36447742-8cfe9086-1685-11e8-83b0-0f1bf64bd64f.png) ## icecast streaming server monitoring netdata now monitors icecast servers [#3511](https://github.com/firehol/netdata/pull/3511) @l2isbad. ## traefik reverse proxy monitoring netdata now monitors traefik reverse proxies - [#3557](https://github.com/firehol/netdata/pull/3557).
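To give a feel for the kind of test the `portcheck` module described above performs, here is a minimal Python sketch of a TCP port probe using only the standard library. It is an illustration, not the module's code; the function name `tcp_port_open` and the default timeout are assumptions.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical target; replace with a host and port you want to probe.
    print(tcp_port_open("127.0.0.1", 19999))
```

A monitoring module would typically also record how long the connection took; that part is left out to keep the sketch short.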
## spring-boot monitoring netdata can now monitor java [spring-boot](https://projects.spring.io/spring-boot/) applications @Wing924 ![2018-02-23 11 34 37](https://user-images.githubusercontent.com/8895721/36575302-a78044fe-188d-11e8-88a5-bc26b0779bfb.png) ![2018-02-23 11 34 48](https://user-images.githubusercontent.com/8895721/36575303-a7b10850-188d-11e8-91c8-6a6d417be943.png) ## dnsdist netdata now monitors dnsdist name servers - @nobody-nobody [#3009](https://github.com/firehol/netdata/pull/3009) ## statsd - statsd dimensions now support the options the external plugin dimensions support (currently the only usable option is `hidden` to add the dimension, but make it hidden on the dashboard - a hidden dimension can participate in various calculations, including alarms). - statsd now reports the CPU usage of its threads at the netdata section. - statsd metrics are logged to access.log the first time they are encountered. - statsd metrics now accept the special value `zinit` to allow them get initialized without altering their values (this is useful if you have rare metrics that you need to initialize when netdata starts). - statsd over TCP is now a lot faster - netdata can process up to 3.5mil statsd metrics / second using just one core. Added options to control the timeouts of TCP statsd connections. - fixed the title and context of statsd private charts - statsd private charts can now be hidden from the dashboard [#3467](https://github.com/firehol/netdata/pull/3467) ## postgres Several new charts have been added to monitor ([#3400](https://github.com/firehol/netdata/pull/3400) by @anayrat): 1. checkpointer charts 2. bgwriter charts 3. autovacuum charts 4. replication delta charts 5. WAL archive charts 6. WAL charts 7. temporary files charts Also, the postgres plugin now also works when postgres is in recovery mode. ## rabbitmq - added Erlang run queue chart. 
This is useful in conjunction with the existing Erlang processes chart to get a better overall idea of what's going on in the Erlang VM. @arch273 - added rabbitmq information on the dashboard to complement the charts. ## apps.plugin netdata prior to this version was detecting the user and group of processes by examining the ownership of `/proc/PID/stat`. Unfortunately it seems that the ownership of files in `/proc` does not change when the process switches user. So, netdata could not detect the user and group of processes that started as root and then switched to another user. Now netdata reads `/proc/PID/status`: - process ownership information is now accurate - eliminated the need to read `/proc/PID/statm` (all the information of `/proc/PID/statm` is available in `/proc/PID/status`) - allowed netdata to read `VmSwap`, so a new chart has been added to monitor the swap memory usage per process, user and group. ![screenshot from 2018-02-24 15-07-47](https://user-images.githubusercontent.com/2662304/36630771-8b4ee42e-1974-11e8-8c70-e56f33631d70.png) - fixed issue with unreasonable spikes on processes cpu on FreeBSD (there was a typo) [#3245](https://github.com/firehol/netdata/issues/3245) - fixed issue with errors reported on FreeBSD about pid 0 [#3099](https://github.com/firehol/netdata/issues/3099) The new plugin is 20% more expensive in terms of CPU. We tried hard to optimize it, but this is as good as it can get. Read about it at [#3434](https://github.com/firehol/netdata/pull/3434) and [#3436](https://github.com/firehol/netdata/pull/3436) ## haproxy Added charts: - hrsp_1xx, hrsp_2xx, hrsp_3xx, hrsp_4xx, hrsp_5xx, hrsp_other, hrsp_total for backends and frontends - qtime, ctime, rtime, ttime metrics for backend servers - backend servers In UP state @ktarasz ## uptime netdata now uses `/proc/uptime` when `CLOCK_BOOTTIME` does not report the same uptime.
In containers `CLOCK_BOOTTIME` reports the uptime of the host, while `/proc/uptime` reports the uptime of the container, so now netdata correctly reports the uptime of the container. ## mdstat various fixes to better monitor rebuild time and rate @l2isbad ## KSM - removed `to_scan` dimension - the savings % reported by netdata was less than the actual - fixed it. ## elasticsearch Added several charts for translog / indices segments statistics and JVM buffer pool utilization, which are often helpful when evaluating an elasticsearch node health [#3544](https://github.com/firehol/netdata/pull/3544) @NeonSludge ## memory monitoring - treat slab memory as cached [#3288](https://github.com/firehol/netdata/pull/3288) @amichelic - added a new chart for monitoring the memory available for use, before hitting swap ![screenshot from 2018-01-07 03-38-30](https://user-images.githubusercontent.com/2662304/34645669-79df0e3c-f35c-11e7-8c85-641e84362b71.png) - netdata now monitors Linux hugepages and transparent hugepages ![screenshot from 2018-02-24 14-28-44](https://user-images.githubusercontent.com/2662304/36630440-104cdb64-196f-11e8-8802-c0557e1f4e28.png) - added hugepages monitoring [#3462](https://github.com/firehol/netdata/pull/3462) ![screenshot from 2018-02-23 15-07-26](https://user-images.githubusercontent.com/2662304/36595571-5003f482-18ab-11e8-8978-c932ad458525.png) ## diskspace monitoring - support huge numbers of mount points [#3258](https://github.com/firehol/netdata/pull/3258) - netdata was crashing with stack overflow due to recursion - now it is a loop, so any number of mount points is supported ## network monitoring - moved tcp passive and active opens to a separate chart, to allow the TCP issues dimensions scale better by default [#3238](https://github.com/firehol/netdata/pull/3238) - updated the information presented on TCP charts to match the latest v4.15 kernel source [#3239](https://github.com/firehol/netdata/pull/3239) ## APC UPS netdata now supports monitoring
multiple APC UPSes. ## ISC DHCPd netdata now also supports monitoring IPv6 leases - @l2isbad ## fronius - added a new dimension `solar_consumption` @ccremer - added alarms @ccremer ## stiebeleltron - added alarms @ccremer ## web_log Added web server response timings histogram [#3558](https://github.com/firehol/netdata/pull/3558) @Wing924. ![2018-03-19 0 06 00](https://user-images.githubusercontent.com/8895721/37576113-f11e6ba2-2b6d-11e8-8ced-dc21cf2b4583.png) ## python.d.plugin - python.d.plugin can now start even if `/etc/netdata/python.d.conf` is missing @l2isbad - python.d.plugin now has an internal run counter @l2isbad - the unicode decoding of the plugin has been fixed ([#3406](https://github.com/firehol/netdata/issues/3406)) @l2isbad - the plugin now does not validate self-signed certificates @l2isbad - the plugin can now revive obsolete charts @l2isbad ## charts.d.plugin charts.d.plugin BASH modules can now have custom number of retries in case of data collection failures [#3524](https://github.com/firehol/netdata/pull/3524). ## web server - netdata now has a new internal web server that supports a fixed number of threads - we call it `static web server`. This web server allows netdata to work around memory fragmentation (since the threads are fixed, the underlying memory allocators reuse the same memory arenas) and cpu utilization (we can control the number of threads that will be used by netdata). This is the default now. [#3248](https://github.com/firehol/netdata/pull/3248) - now the static threads web server reports the CPU usage of each of its threads. - the HTTP response headers now include the netdata version ## dashboard - the print button now respects the URL path netdata is hosted at.
- dygraphs updated to the latest version - this fixes an issue that prevented netdata charts from being interactive under certain conditions - added dygraph theme `logscale` [#3283](https://github.com/firehol/netdata/pull/3283) - fontawesome updated to version 5 - d3 updated to the latest version (this broke c3 charts that require an older version) - added d3pie charts ![optimized-d3pie](https://user-images.githubusercontent.com/2662304/35773830-3bbd5d00-0966-11e8-9703-f21a8181016f.gif) - custom dashboards can now have alarms for specific roles (all, none, one or more). - allow stacked charts to zoom vertically when dimensions are selected ![peek 2018-01-27 13-35](https://user-images.githubusercontent.com/2662304/35471600-fcd1b154-0366-11e8-807c-ed3219b5a944.gif) - netdata now has a global XSS protection [#3363](https://github.com/firehol/netdata/pull/3363) ![screenshot from 2018-01-30 00-30-05](https://user-images.githubusercontent.com/2662304/35537934-c1ff2be8-0554-11e8-9996-32899ece4cd3.png) - netdata now uses intersectionObserver when available [#3280](https://github.com/firehol/netdata/pull/3280) - this improves the scrolling performance of the dashboard. - prevent date, time and units from wrapping at the charts legends [#3286](https://github.com/firehol/netdata/pull/3286) - various units scaling improvements [#3285](https://github.com/firehol/netdata/pull/3285) - added `data-common-colors="NAME"` chart option for custom dashboards [#3282](https://github.com/firehol/netdata/pull/3282). - added wiki page for creating custom dashboards on [Atlassian's Confluence](https://github.com/firehol/netdata/wiki/Custom-Dashboard-with-Confluence). ![final-confluence4](https://user-images.githubusercontent.com/2662304/34366214-767fa4b8-eaa1-11e7-83af-0b9b9b72aa73.gif) - prevented a double click on the charts' toolbox to select the text of the buttons. 
- fixed the alignment of dashboard icons [#3224](https://github.com/firehol/netdata/pull/3224) @xPaw - added a simple js, called [refresh-badges.js](https://github.com/firehol/netdata/blob/master/web/refresh-badges.js), to update badges on a custom web page ## badges netdata badges can now be scaled [#3474](https://github.com/firehol/netdata/pull/3474) ![screenshot from 2018-02-26 01-50-33](https://user-images.githubusercontent.com/2662304/36648114-968f625e-1a97-11e8-9971-8bfb638477b6.png) ![screenshot from 2018-02-26 01-50-55](https://user-images.githubusercontent.com/2662304/36648116-99db562a-1a97-11e8-97b3-8a967ef5228f.png) ![screenshot from 2018-02-26 01-51-21](https://user-images.githubusercontent.com/2662304/36648117-9c24060c-1a97-11e8-9715-a75bff36e38d.png) ## API - added `gtime` parameter, for **group time**. This is used to request from netdata to return values in a different rate (i.e. `gtime=60` on a `X/sec` dimension, will return `X/min`). - fixed a rounding bug in JSON generation [#3309](https://github.com/firehol/netdata/pull/3309) - the `dimensions=` parameter now supports simple patterns [#3170](https://github.com/firehol/netdata/pull/3170) and added option values `match-ids` and `match-names` to control which matches are executed for dimensions. ## alarms - `system.swap` alarms now send notifications with a 30 seconds delay, to work-around a kernel bug that incorrectly reports all swap as instantly used under containers [#3380](https://github.com/firehol/netdata/issues/3380). - added alarm to predict the time a mount point will run out of inodes [#3566](https://github.com/firehol/netdata/pull/3566). 
- all system alarms are now ported to FreeBSD too [#3337](https://github.com/firehol/netdata/pull/3337) @arch273 - added [alerta.io notifications](https://github.com/firehol/netdata/wiki/Alerta-monitoring-system) @kattunga ![](http://docs.alerta.io/en/latest/_images/alerta-screen-shot-3.png) - added available memory alarm ![screenshot from 2018-01-07 03-39-05](https://user-images.githubusercontent.com/2662304/34645671-81e64c80-f35c-11e7-92ef-1b9af8c42d60.png) - removed unsupported html tags from hipchat notifications. - pagerduty notifications have been modified to avoid incident duplication [#3549](https://github.com/firehol/netdata/pull/3549). - alarm definitions can now use both chart IDs and chart names (prior to this version only chart IDs were allowed). - `curl` options (e.g. for disabling SSL certificate verification) for `alarm-notify.sh` can now be defined in `health_alarm_notify.conf`. - netdata can now send notifications to IRC channels [#3458](https://github.com/firehol/netdata/pull/3458) @manosf IRCCloud web client: ![image](https://user-images.githubusercontent.com/31221999/36793487-3735673e-1ca6-11e8-8880-d1d8b6cd3bc0.png) Irssi terminal client: ![image](https://user-images.githubusercontent.com/31221999/36793486-3713ada6-1ca6-11e8-8c12-70d956ad801e.png) ## backends - on netdata masters, allow filtering the hosts that will be sent to backends with `send hosts matching = *` pattern. - improved connection error handling and added retries to allow netdata to connect to certain backends that failed with `EALREADY` or `EINPROGRESS`. - json backends now receive `host tags` (the tags have to be formatted in a json friendly way) [#3556](https://github.com/firehol/netdata/pull/3556). - re-worked the alarm that triggers when backend data are lost, to avoid flip-flops.
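As a rough illustration of the host filter mentioned in the backends list above, the option could be set in `netdata.conf` along these lines. This is a sketch under assumptions: the `[backend]` section placement, the host name pattern `db-test*`, and the reading that a leading `!` excludes matching hosts (following netdata's simple patterns convention) are not stated in these release notes, so check them against the configuration reference for your version.

```
[backend]
    # hypothetical patterns: skip hosts named db-test*, send every other host
    send hosts matching = !db-test* *
```

With the default `*` pattern shown in the release note, all hosts known to the master are sent.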
#### prometheus backends - added URL option `timestamps=yes|no` to `/api/v1/allmetrics` to support prometheus Pushgateway [#3533](https://github.com/firehol/netdata/pull/3533) - added `netdata_info` variable with the version of netdata - renamed `netdata_host_tags` to `netdata_host_tags_info` (the old name still exists but is deprecated and will be removed eventually) - when prometheus uses `average` metrics, netdata remembers the last time prometheus collected metrics, on a per host basis. ## metrics streaming between netdata - netdata masters and proxies now expose the version of the netdata collecting the metrics, not their own. So, now a netdata master shows on the dashboard and sends to backends the version of the netdata collecting the metrics [#3538](https://github.com/firehol/netdata/pull/3538). - added `stream.conf` option `multiple connections = accept | deny` to allow or deny multiple connections for the same netdata host. The default remains `accept`, but it is likely to be changed to `deny` in future versions. ## packaging - added docker hub builds for aarch64/arm64 @justin8 - updated debian containers to use stretch @justin8 - added FreeBSD init file - various installer fixes and improvements (make sure netdata is started, do not give information about features not supported on each operating system, allow non-root installations without errors, etc.) - various installer fixes for FreeBSD and MacOS - `netdata-updater` was growing the `PATH` variable on each of its runs - fixed it. - added `--accept` and `--dont-start-it` command line options to `kickstart-static64.sh` - netdata can be compiled without `long double` support (useful in embedded devices that don't support long double numbers) [#3354](https://github.com/firehol/netdata/pull/3354) - fixed `netdata.spec` to allow building netdata on older and newer rpm based distros.
Also [added a script to build a netdata rpm](https://github.com/firehol/netdata/blob/master/contrib/rhel/build-netdata-rpm.sh) - static netdata installer now tries to find the location of the SSL ca-certificates on a system and properly configures the static `curl` provided with this path. - the netdata updater starts netdata only if it was running - added alpine dockerfile ## other - added global option `gap when lost iterations` to control the number of iterations that should be lost to show a gap on the charts. - various fixes/improvements related to netdata logs - the main change is that now netdata logs the thread name that logged the message, providing helpful insights about the thread that complained. - re-worked the exit procedure of netdata to allow it to clean up properly - sometimes netdata was deadlocked during exit, waiting forever - now netdata always exits promptly [#3184](https://github.com/firehol/netdata/pull/3184) - fixed compilation on ancient gcc versions - netdata was always setting itself to the `idle` process scheduling priority, even when it was configured to do otherwise. Fixed it [#3523](https://github.com/firehol/netdata/pull/3523) 2018-03-27T20:52:12+00:00 netdata v1.11.0 netdata v1.11.0 2018-11-06T09:18:22+00:00 ### New to netdata?
Check its demo: [https://my-netdata.io](http://my-netdata.io) > [![User Base](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&label=user%20base&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Monitored Servers](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&label=servers%20monitored&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Served](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&label=sessions%20served&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) > > [![New Users Today](http://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=persons&after=-86400&options=unaligned&group=incremental-sum&label=new%20users%20today&units=null&value_color=blue&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![New Machines Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_entries&dimensions=machines&group=incremental-sum&after=-86400&options=unaligned&label=servers%20added%20today&units=null&value_color=orange&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) [![Sessions Today](https://registry.my-netdata.io/api/v1/badge.svg?chart=netdata.registry_sessions&after=-86400&group=incremental-sum&options=unaligned&label=sessions%20served%20today&units=null&value_color=yellowgreen&precision=0&v42)](https://registry.my-netdata.io/#menu_netdata_submenu_registry) --- Hi all, It has been 8 months since the last release of Netdata. We delayed releases a bit, but as you can see on these release notes, we were working hard to provide **the best Netdata ever**. 
Thanks to [synacktiv.com](https://www.synacktiv.com/en/) and [red4sec.com](https://www.red4sec.com/en), we fixed a number of vulnerabilities in the code base (check below), so release 1.11 of Netdata is **the most secure Netdata** so far. **All users are advised to update to this version asap.** Netdata now has its own organization on GitHub. So, we moved from `firehol/netdata` to `netdata/netdata`! We also provide new docker images as [`netdata/netdata`](https://hub.docker.com/r/netdata/netdata/) (the old ones are deprecated and are not updated any more). Netdata community grows faster than ever. Currently netdata grows by +2k unique users and +1k unique installations **per day**, every day! Contributions sky rocket too. To make it even easier for newcomers to get involved, we modularized all the code, now organized into a hierarchy of directories. We also moved most of the documentation, from the wiki into the repo. This is quite unique. Netdata is one of the first projects that organizes code and docs under the same hierarchy. Browse the repo; **you will be surprised!** Examples: [data collection plugins](https://github.com/netdata/netdata/tree/master/collectors#data-collection-plugins), [database](https://github.com/netdata/netdata/tree/master/database), [backends](https://github.com/netdata/netdata/tree/master/backends), [web server](https://github.com/netdata/netdata/tree/master/web/server), [ARL, including benchmarks](https://github.com/netdata/netdata/tree/master/libnetdata/adaptive_resortable_list), etc. Many thanks to all the contributors that help building, enhancing and improving a project useful and helpful to hundreds of thousands of admins, devops and developers around the world! You rock! @ktsaou --- #### Automatic Updates broken There was an accidental breaking change in the master repo of netdata. 
All users that use automatic updates, are advised to run: ```sh sudo sh -c 'cd /usr/src/netdata.git && git fetch --all && git reset --hard origin/master && ./netdata-updater.sh -f' ``` After that, `netdata-updater` will be able to update your netdata. --- #### Stock config files are now in `/usr/lib/netdata` We prepare netdata for binary packages. This required stock config files to be overwritten unconditionally when new netdata binary packages are installed. So, all config files we ship with netdata are now installed under `/usr/lib/netdata/conf.d`. To edit config files, we have supplied the script `/etc/netdata/edit-config` that automatically moves the config file you need to edit to `/etc/netdata` and opens an editor for you. --- #### New query engine The [query engine of netdata](https://github.com/netdata/netdata/tree/master/web/api/queries) has been re-written to support query plugins. We have already added the following algorithms that are available for alarm, charts and badges: - [`stddev`](https://github.com/netdata/netdata/tree/master/web/api/queries/stddev), for calculating the **standard deviation** on any time-frame. - [`ses` or `ema` or `ewma`](https://github.com/netdata/netdata/tree/master/web/api/queries/ses), for calculating the **exponential weighted moving average**, or **single/simple exponential smoothing** on any time-frame. - [`des`](https://github.com/netdata/netdata/tree/master/web/api/queries/des), for calculating the **double exponential smoothing** on any time-frame. - [`cv` or `rsd`](https://github.com/netdata/netdata/tree/master/web/api/queries/stddev#coefficient-of-variation-cv), for calculating the **coefficient of variation** for any time-frame. --- ## Fixed Security Issues #### Identified by Red4Sec.com - `CVE-2018-18836` Fixed JSON Header Injection (an attacker could send `\n` encoded in the request to inject a JSON fragment into the response). 
- `CVE-2018-18837` Fixed HTTP Header Injection (an attacker could send `\n` encoded in the request to inject an HTTP header into the response). - `CVE-2018-18838` Fixed LOG Injection (an attacker could send `\n` encoded in the request to inject a log line at `access.log`). - `CVE-2018-18839` **Not fixed** Full Path Disclosure, since these are **intended** (netdata reports the absolute filename of web files, alarm config files and alarm handlers). #### Identified by Synacktiv - Fixed Privilege Escalation by manipulating `apps.plugin` or `cgroup-network` error handling. - Fixed LOG injection (by sending URLs with `\n` in them). --- ## Packaging - Our **official docker hub images** are now at [`netdata/netdata`](https://hub.docker.com/r/netdata/netdata/). These images are based on **Alpine Linux** for optimal footprint. We provide images for `i386`, `amd64`, `aarch64` and `armhf`. - the supplied `netdata.service` now allows configuring process scheduling priorities exclusively on `netdata.service` (no need to change `netdata.conf` too). - the supplied `netdata.service` is now installed in `/usr/lib/systemd/system`. - Stock netdata configurations are now installed in `/usr/lib/netdata/conf.d` and a new script has been added to allow easily copying and editing config files: `/etc/netdata/edit-config`. --- ## New Data Collection Modules - `rethinkdbs` for monitoring RethinkDB performance - `proxysql` for monitoring ProxySQL performance - `litespeed` for monitoring LiteSpeed web server performance. - `uwsgi` for monitoring uWSGI performance - `unbound` for monitoring the performance of Unbound DNS servers. - `powerdns` for monitoring the performance of PowerDNS servers. - `dockerd` for monitoring the health of dockerd - `puppet` for monitoring Puppet Server and Puppet DB. - `logind` for monitoring the number of active users. 
- `adaptec_raid` and `megacli` for monitoring the relevant RAID controllers - `spigotmc` for monitoring Minecraft server statistics - `boinc` for monitoring Berkeley Open Infrastructure for Network Computing clients. - `w1sensor` for monitoring multiple 1-Wire temperature sensors. - `monit` for collecting process, host, filesystem, etc checks from monit. - `linux_power_supplies` for monitoring Linux Power Supplies attributes --- ## Data Collection Orchestrators Changes - `node.d.plugin` does not use the `js` command any more. - `python.d.plugin` now uses `monotonic` clocks. There was a discrepancy in the clocks used in netdata that made the timing of python modules drift over time (it was missing 1 sec per day). - added `MySQLService` for quickly adding plugins using mysql queries. - `URLService` now supports self-signed certificates and supports custom client certificates. - all `python.d.plugin` modules that require `sudo` to collect metrics are now disabled by default, to avoid security alarms on installations that do not need them. --- ## Improved Data Collection Modules - `apps.plugin` now detects changes in process file descriptors, also fixed a couple of memory leaks. Its default configuration has been enriched significantly, especially for IoT. - `freeipmi.plugin` now supports option `ignore-status` to ignore the status reported by given sensors. #### `statsd.plugin` (for collecting custom APM metrics) - The charting thread has been optimized for lowering its CPU consumption when several millions of metrics are collected. - `sets` now report zeros instead of gaps when no data are collected - `histograms` and `timers` have been optimized for lowering their CPU consumption when several thousands of such metrics are collected. - `histograms` had wrong sampling rate calculations. - `gauges` now ignore sampling rate when no sign is included in the value. - the minimum sampling rate supported is now 0.001.
- netdata statsd is now a drop-in replacement for datadog statsd (although statsd tags are currently ignored by netdata). #### `proc.plugin` (Linux, system monitoring) - Unused interrupts and softirqs are no longer included in charts (this saves quite some processing power and memory on systems with dozens of CPU cores). - fixed `/proc/net/snmp` parsing of `IcmpMsg` lines that failed on a few systems. - Veritas Volume Manager disks are now recognized and named accordingly. - Now netdata collects `TcpExtTCPReqQFullDrop` and re-organizes metrics in charts to properly monitor the TCP SYN queue and the TCP Accept queue of the kernel. - Many charts that were previously reported as IPv4 were actually reflecting metrics for both IPv4 and IPv6. They have been renamed to `ip.*`. - netdata now monitors `SCTP`. - Fixed BTRFS over BCACHE sector size detection. - BCACHE data collection is now faster. - `/proc/interrupts` and `/proc/softirqs` parsing fixes. #### `diskspace.plugin` (Linux, disk space usage monitoring) - It does not `stat()` excluded mount points any more (it was interfering with kerberos authenticated mount points). - several filesystems are now by default excluded from disk-space monitoring, to avoid breaking suspend on workstations. #### `freebsd.plugin` (FreeBSD, PFSense, system monitoring) - `laundry` memory is now monitored. - `system.net` and `system.packets` charts added that report the total bandwidth and packets of all physical network interfaces combined. #### `python.d.plugin` PYTHON modules (applications monitoring) - `web_log` module now supports virtual hosts, reports http/https metrics, supports `squid` logs - `nginx_plus` module now handles non-continuous peer IDs (bug fix) - `ipfs` module is optimized, the use of its Pin API is now disabled by default and can be enabled with a netdata module option (using the IPFS Pin API increases the load on the IPFS server). - `fail2ban` module now supports IPv6 too.
- `ceph` module now checks permissions and properly reports issues - `elasticsearch` module got better error handling - `nginx_plus` module now uses upstream `ip:port` instead of transient id to identify dimensions. - `redis`, now it supports Pika, collects evicted keys, fixes authentication issues reported and improves exception handling. - `beanstalk`, bug fix for yaml config loading. - `mysql`, the % of active connections is now monitored, query types are also charted. - `varnish`, now it supports versions above 5.0.0 - `couchdb` - `phpfpm`, now supports IPv6 too. - `apache`, now supports IPv6 too. - `icecast` - `mongodb`, added support for connect URIs - `postgres` - `elasticsearch`, now it supports versions above 6.3.0, fixed JSON parse errors - `mdstat`, now collects `mismatch_cnt` - `openvpn_log` #### `node.d.plugin` NODE.JS modules - `snmp` was incorrectly parsing new OID names as floats. Fixed it. #### `charts.d.plugin` BASH modules - `nut` now supports naming UPSes. --- ## Health Monitoring - Added variable `$system.cpu.processors`. - Added alarms for detecting abnormally high load average. - `TCP` `SYN` and `TCP` accept queue alarms, replacing the old softnet dropped alarm that was too generic and reported many false positives. - system alarms are now enabled on FreeBSD. - netdata now reads NIC speed and sets alarms on each interface to detect congestion. - Network alarms are now relaxed to avoid false positives. - New `bcache` alarms. - New `mdstat` alarms. - New `apcupsd` alarms. - New `mysql` alarms. - New notification methods: - **rocket.chat** - **Microsoft Teams** - **syslog** - **fleep.io** - **Amazon SNS** --- ## Backends - Host tags are now sent to Graphite - Host variables are now sent to Prometheus --- ## Streaming - Each netdata slave and proxy now filters the charts that are streamed. This allows exposing netdata masters to third parties by limiting the number of charts available at the master.
- Fixed a bug in streaming slaves that randomly prevented them from resuming streaming after network errors. - Fixed a bug on slaves that sent duplicated chart names under certain conditions. - Fixed a bug that caused slaves to consume 100% CPU (due to a misplaced lock) when multiple threads were adding dimensions on the same chart. - The receiving nodes of streaming (netdata masters and proxies) can now rate-limit inbound streaming requests. - Re-worked time synchronization between netdata slaves and masters. --- ## API - Badges that report time now show `undefined` instead of `never`. --- ## Dashboard - Added `UTC` timezone to the list of available time-zones. - The dashboard was sending some non-HTTP compliant characters in the URLs that made netdata dashboards break when used under certain proxies. Fixed. 2018-11-06T09:18:22+00:00 netdata v1.11.1 netdata v1.11.1 2018-11-22T21:56:11+00:00 This is a patch (bug fix) release of netdata. Our work to move all the documentation inside the repo is still in progress. Everything has been moved, but we still need to refactor a lot of the pages to be more meaningful. The README file on netdata home has been rewritten. [Check it here](https://github.com/netdata/netdata#netdata----). ## Improved internal database Overflown incremental values (counters) no longer show a zero point at the charts. Netdata detects the width (8bit, 16bit, 32bit, 64bit) of each counter and properly calculates the delta when the counter overflows. The internal database format has been extended to support values above 64bit. ## New data collection plugins 1. `openldap`, to collect performance statistics from OpenLDAP servers. 2. `tor`, to collect traffic statistics from Tor. 3. `nvidia_smi` to monitor NVIDIA GPUs. ## Improved data collection plugins - **BUG FIX**: network interface names with colon (`:`) in them were incorrectly parsed and resulted in faulty data collection values.
- **BUG FIX**: `smartd_log` has been refactored, has better python v2 compatibility, and now supports SCSI SMART attributes - `cpufreq` has been re-written in C - since this module is common, we decided to convert it to an internal plugin to lower the pressure on the python ones. There are a few more that will be transitioned to C in the next release. - **BUG FIX**: `sensors` got some compatibility fixes and improved handling for `lm-sensors` errors. ## Health monitoring - **BUG FIX**: max network interface speed data collection was faulty, which resulted in false-positive alarms on systems with multiple interfaces using different speeds (the speed of the first network interface was used for all network interfaces). Now the interface speed is shown as a badge: ![image](https://user-images.githubusercontent.com/2662304/48292282-610e2b00-e482-11e8-95e6-478094160f4f.png) - `alerta.io` notifications got a few improvements - **BUG FIX**: `conntrack_max` alarm has been restored (was not working due to an invalid variable name referenced) ## Registry (`my-netdata` menu) It has been refactored a bit to reveal the URLs known for each node and now it supports deleting individual URLs. ## Packaging - `openrc` service definition got a few improvements 2018-11-22T21:56:11+00:00 netdata v1.12.0 netdata v1.12.0 2019-02-14T11:24:29+00:00 ## At a glance Release 1.12 is made out of 211 pull requests and 22 bug fixes. The key improvements are: - Introducing `netdata.cloud`, the free netdata service for all netdata users - High performance plugins with go.d.plugin (data collection orchestrator written in Go) - 7 new data collectors and 11 rewrites of existing data collectors for improved performance - A new management API for all netdata servers - Bind different functions of the netdata APIs to different ports - Improved installation and updates ## netdata.cloud `netdata.cloud` is a free service for all netdata users.
Currently it replaces the old netdata registry, while providing single sign-on with GitHub and Google accounts. Using `netdata.cloud` we plan to provide the following features: - distributed authentication (password protection) for all netdata installations - network view for all nodes - cross node custom dashboard editor, storage and sharing - centralized health monitoring and alarm notifications and many more. Read more about `netdata.cloud` [here](https://netdata.cloud/about). ## Bind API functions to different ports netdata can now bind its API functions to different ports. The following API functions can be isolated: - `dashboard` for accessing the dashboard - `badges` for generating badges - `streaming` for receiving streamed metrics from remote netdata servers - `management` for receiving management commands - `registry` for accessing the netdata registry - `netdata.conf` for downloading the current configuration To bind API functions to different ports, append `=function|function|...` to the port definition, like this: ``` [web] bind to = *:19999=dashboard|netdata.conf *:20000=streaming ``` The above will bind netdata: - on all IPs (`*`) at port `19999` for dashboard access and access to `netdata.conf` - on all IPs (`*`) at port `20000` for receiving streamed data from remote netdata servers For more information about binding API functions to different ports, [check this](https://docs.netdata.cloud/web/server/#binding-netdata-to-multiple-ports). ## Management API Netdata now has a management API. We plan to provide a full set of configuration commands using this API. In this release, the management API supports disabling or silencing alarms during maintenance periods. For more information about the management API, check [this](https://docs.netdata.cloud/web/api/health/#health-management-api). ## Anonymous statistics Anonymous usage information is collected by default and sent to Google Analytics.
The statistics calculated from this information will be used for: 1. **Quality assurance**, to help us understand if netdata behaves as expected and help us identify repeating issues for certain distributions or environments. 2. **Usage statistics**, to help us focus on the parts of netdata that are used the most, or help us identify the extent to which our development decisions influence the community. Information is sent to Netdata via two different channels: - Google Tag Manager is used when an agent's dashboard is accessed. - The script `anonymous-statistics.sh` is executed by the Netdata daemon, when Netdata starts, stops cleanly, or fails. Both methods are controlled via the same [opt-out mechanism](https://docs.netdata.cloud/docs/anonymous-statistics/#opt-out). For more information, [check this](https://docs.netdata.cloud/docs/anonymous-statistics/). ## Data collection This release introduces a new Go plugin orchestrator. This plugin has its own [github repo](https://github.com/netdata/go-orchestrator). It is open-source and uses the same license, and we welcome contributions. The orchestrator can also be used to build custom data collection plugins written in Go. We have used the orchestrator to write many new Go plugins in our [go.d plugin github repo](https://github.com/netdata/go.d.plugin). For more information, [check this](https://github.com/netdata/go-orchestrator#go-orchestrator-wip). New data collectors: - Activemq (Go) - Consul (Go) - Lighttpd2 (Go) - Solr (Go) - Springboot2 (Go) - mdstat - nonredundant arrays (C) - CUPS printing system (C) High performance versions of older data collectors: - apache (Go) - dns_query (Go) - Freeradius (Go) - Httpcheck (Go) - Lighttpd (Go) - Portcheck (Go) - Nginx (Go) - cpufreq (C) - cpuidle (C) - mdstat (C) - power supply (C) Other improved data collectors: - Fix the python plugin clock (collectors falling behind). - adaptec_raid: add to python.d.conf. - apcupsd: Detect if UPS is online.
- apps: Fix process statistics collection for FreeBSD. - apps: Properly look up docker container name when running in ECS. - fail2ban: Add 'Restore Ban' action. - go_expvar: Don't check for duplicate expvars. - hddtemp: Don't use disk model as dim name. - megacli: add to python.d.conf. - nvidia_smi: handle `N/A` values. - postgres: Fix integer out of range error on Postgres 11, fix locks count. - proc: Don't show zero charts for ZFS filesystem. - proc: Fix cached memory calculation. - sensors: Don't ignore 0 RPM fans on start. - smartd_log: Fix check() unhandled exception: list index out of range. - SNMP: Gracefully ignore the offset if the value is not a number. ## Packaging and Installation - Upload nightly builds to Google Cloud. Use the nightlies in new installations and updates. - Improved uninstaller. - Scramble packages in docker images with polymorphic Linux. - Building RPMs: Fix permissions for log files, remove rolling version suffix. ## Health Monitoring - Add Prowl notifications for iOS users. - Show count of active alarms per state in email notifications. - Show evaluated expression and expression variable values in email notifications. - Improve support for slack recipients (channels/users). - Custom notifications: Fix bug with alarm role recipients. ## Dashboards - Server filtering in `my-netdata` menu when signed in to `netdata.cloud` - All units are now IEC-compliant abbreviations (KiB, MiB etc.). - GUI: Make entire row clickable in the registry menu showing the list of servers. ## Backends - Do not report stale metrics to prometheus. ## Other - Deprecated multi-threaded and single-threaded web servers, in preparation for Windows support. - Documentation improvements. - Treat `DT_UNKNOWN` files as regular files. - API: Stricter rules for URL separators. 2019-02-14T11:24:29+00:00 netdata v1.12.1 netdata v1.12.1 2019-02-21T19:28:06+00:00 Patch release 1.12.1 contains 22 bug fixes and 8 improvements.
### Bug Fixes - Fix SIGSEGV at startup: Don't free vars of charts that do not exist [\#5455](https://github.com/netdata/netdata/pull/5455) - Add timeouts to the installer for the go.d plugin and update the installer documentation for servers with no internet access. - Prevent invalid Linux power supply alarms during startup [\#5447](https://github.com/netdata/netdata/pull/5447) - Correct duplicate flag enum in health.h [\#5441](https://github.com/netdata/netdata/pull/5441) - Remove extra 'v' for netdata version from Server response header [\#5440](https://github.com/netdata/netdata/pull/5440) and spec URL [\#5427](https://github.com/netdata/netdata/pull/5427) - Fix curl download in installer [\#5439](https://github.com/netdata/netdata/pull/5439) - apcupsd - Treat ONBATT status the same as ONLINE [\#5435](https://github.com/netdata/netdata/pull/5435) - Fix \#5430 - LogService.\_get\_raw\_data under python3 fails on undecodable data [\#5431](https://github.com/netdata/netdata/pull/5431) - Correct version check in UI [\#5429](https://github.com/netdata/netdata/pull/5429) - Fix ERROR 405: Cannot download charts index from server - cpuidle handle newlines in names [\#5425](https://github.com/netdata/netdata/pull/5425) - Improve configure.ac mnl and netfilter\_acc checks for static builds [\#5424](https://github.com/netdata/netdata/pull/5424) - Fix clock\_gettime\(\) failures with the CLOCK\_BOOTTIME argument [\#5415](https://github.com/netdata/netdata/pull/5415) - Use netnsid for detecting cgroup networks; [\#5413](https://github.com/netdata/netdata/pull/5413) - Python module sensors fix [\#5406](https://github.com/netdata/netdata/pull/5406) ([ilyam8](https://github.com/ilyam8)) - Fix kickstart-static64.sh script [\#5397](https://github.com/netdata/netdata/pull/5397) - Fix ceph.chart.py for Python3 [\#5396](https://github.com/netdata/netdata/pull/5396) ([GaetanF](https://github.com/GaetanF)) - Added missing BuildRequires for autoconf, automake 
[\#5363](https://github.com/netdata/netdata/pull/5363) - Fix wget log spam in headless mode \(fixes \#5356\) [\#5359](https://github.com/netdata/netdata/pull/5359) - Fix warning condition for mem.available [\#5353](https://github.com/netdata/netdata/pull/5353) - cups.plugin: Support older versions [\#5350](https://github.com/netdata/netdata/pull/5350) - Fix AC\_CHECK\_LIB to work correctly with cups library [\#5349](https://github.com/netdata/netdata/pull/5349) - Fix issues reported by Codacy ### Improvements - Add driver-type option to the freeipmi plugin [\#5384](https://github.com/netdata/netdata/pull/5384) - Add support of tera-byte size for Linux bcache. [\#5373](https://github.com/netdata/netdata/pull/5373) - Split nfacct plugin into separate process [\#5361](https://github.com/netdata/netdata/pull/5361) - Localization support in HTML docs, simplification of checklinks.sh [\#5342](https://github.com/netdata/netdata/pull/5342) - Cleanup updater script and no `/opt` usage [\#5218](https://github.com/netdata/netdata/pull/5218) - Add cgroup cpu and memory limits and alarms [\#5172](https://github.com/netdata/netdata/pull/5172) - Add message queue statistics [\#5115](https://github.com/netdata/netdata/pull/5115) - Documentation improvements 2019-02-21T19:28:06+00:00 netdata v1.12.2 netdata v1.12.2 2019-02-28T18:21:31+00:00 Patch release 1.12.2 contains 7 bug fixes and 4 improvements. ### At a glance The main motivation behind a new patch release is the introduction of a **stable release channel**. A "stable" installation and update channel was always on our roadmap, but it became a necessity when we realized that our users in China could not use the nightly releases published on Google Cloud. The "stable" channel is based on our official GitHub releases and uses assets hosted on GitHub. We are also introducing a new **Oracle DB collector** module, implemented in Python. 
### Bug Fixes - Installer at https://my-netdata.io/kickstart.sh isnt updated to master branch [\#5492](https://github.com/netdata/netdata/issues/5492) - Zombie processes exist after restart netdata - add heartbeat to python.d plugin [\#5491](https://github.com/netdata/netdata/issues/5491) - Verbose curl output causes unwanted emails from netdata-updater cronjob [\#5484](https://github.com/netdata/netdata/issues/5484) - RocketChat notifications not working [\#5470](https://github.com/netdata/netdata/issues/5470) - go.d.plugin installation fails due to insufficient timeout [\#5467](https://github.com/netdata/netdata/issues/5467) - SIGSEGV crash during shutdown of tc plugin [\#5366](https://github.com/netdata/netdata/issues/5366) - CMake warning for nfacct plugin [\#5379](https://github.com/netdata/netdata/pull/5379) ### Improvements - Introduce stable installation channel [\#5487](https://github.com/netdata/netdata/pull/5487) - Oracledb python module [\#5421](https://github.com/netdata/netdata/pull/5421) - Show streamed servers even for users that are not signed in [\#5519](https://github.com/netdata/netdata/pull/5519) - Prevent merging changes to kickstart.sh when checksum in docs is wrong [\#5498](https://github.com/netdata/netdata/pull/5498) 2019-02-28T18:21:31+00:00 netdata v1.13.0 netdata v1.13.0 2019-03-14T20:11:22+00:00 Release 1.13 contains 14 bug fixes and 8 improvements. ### At a glance netdata has taken the first step into the world of Kubernetes, with a beta version of a [Helm chart](https://github.com/netdata/helmchart) for deployment to a k8s cluster and [proper naming](https://github.com/netdata/netdata/pull/5576) of the cgroup containers. We have [big plans](https://github.com/netdata/netdata/issues/5392) for Kubernetes, so stay tuned! A [major refactoring of the python.d plugin](https://github.com/netdata/netdata/pull/5552) has resulted in a dramatic decrease of the required memory, making netdata even more resource efficient. 
We also added charts for IPC shared memory segments and total memory used. ### Acknowledgements - [varyumin](https://github.com/varyumin), who graciously shared the original Kubernetes Helm chart and is still helping improve it - [p-thurner](https://github.com/p-thurner) for his great work on the SSL certificate expiration module. - [Ferroin](https://github.com/Ferroin) for his priceless insights and assistance - [Jaxmetalmax](https://github.com/Jaxmetalmax) for graciously helping us identify and fix Postgres connection issues ### Improvements - Kubernetes: Helm chart (https://github.com/netdata/helmchart) and proper cgroup naming [\#5576](https://github.com/netdata/netdata/pull/5576) ([cakrit](https://github.com/cakrit)) - python.d.plugin: Reduce memory usage with separate process for initial module checking [\#5552](https://github.com/netdata/netdata/pull/5552) ([ilyam8](https://github.com/ilyam8)) and loaders cleanup [\#5602](https://github.com/netdata/netdata/pull/5602) ([ilyam8](https://github.com/ilyam8)) - IPC shared memory charts [\#5522](https://github.com/netdata/netdata/pull/5522) ([vlvkobal](https://github.com/vlvkobal)) - mysql module: Add SSL connection support [\#5610](https://github.com/netdata/netdata/pull/5610) ([ilyam8](https://github.com/ilyam8)) - FreeIPMI: Have the debug option apply the internal freeipmi debug flags [\#5548](https://github.com/netdata/netdata/pull/5548) ([cakrit](https://github.com/cakrit)) - Prometheus backend: Support legacy metric names for source=avg [\#5531](https://github.com/netdata/netdata/pull/5531) ([cakrit](https://github.com/cakrit)) - Registry: Allow deleting the host we are looking at [\#5537](https://github.com/netdata/netdata/pull/5537) ([cakrit](https://github.com/cakrit)) - SpigotMC: Use regexes for parsing.
[\#5507](https://github.com/netdata/netdata/pull/5507) ([Ferroin](https://github.com/Ferroin)) ### Bug Fixes - Postgres: fix connection issues [\#5618](https://github.com/netdata/netdata/pull/5618) ([Jaxmetalmax](https://github.com/Jaxmetalmax)), [\#5617](https://github.com/netdata/netdata/pull/5617) ([ilyam8](https://github.com/ilyam8)) - Proxmox container: Fix cgroup naming [\#5612](https://github.com/netdata/netdata/pull/5612) ([vlvkobal](https://github.com/vlvkobal)) and use total\_\* memory counters for cgroups [\#5592](https://github.com/netdata/netdata/pull/5592) ([vlvkobal](https://github.com/vlvkobal)) - proc.plugin and plugins.d: Fix memory leaks [\#5604](https://github.com/netdata/netdata/pull/5604) ([vlvkobal](https://github.com/vlvkobal)) - SpigotMC: Fix UnicodeDecodeError [\#5598](https://github.com/netdata/netdata/pull/5598) ([ilyam8](https://github.com/ilyam8)) and py2 compatibility fix [\#5593](https://github.com/netdata/netdata/pull/5593) ([ilyam8](https://github.com/ilyam8)) - Fix non-obsolete dimension deletion [\#5563](https://github.com/netdata/netdata/pull/5563) ([vlvkobal](https://github.com/vlvkobal)) - UI: Fix incorrect icon for the streaming master \#5560 [\#5561](https://github.com/netdata/netdata/pull/5561) ([gmosx](https://github.com/gmosx)) - Docker container names: Retry renaming when a name is not found [\#5557](https://github.com/netdata/netdata/pull/5557) ([vlvkobal](https://github.com/vlvkobal)) - apps.plugin: Don't send zeroes for empty process groups [\#5540](https://github.com/netdata/netdata/pull/5540) ([vlvkobal](https://github.com/vlvkobal)) - go.d.plugin: Correct sha256sum check [\#5539](https://github.com/netdata/netdata/pull/5539) ([cakrit](https://github.com/cakrit)) - Unbound module: Documentation corrected with troubleshooting section. 
[\#5528](https://github.com/netdata/netdata/pull/5528) ([Ferroin](https://github.com/Ferroin)) - Streaming: Prevent UI issues upon GUID duplication between master and slave netdata instances [\#5511](https://github.com/netdata/netdata/pull/5511) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Linux power supply module: Fix missing zero dimensions [\#5395](https://github.com/netdata/netdata/pull/5395) ([vlvkobal](https://github.com/vlvkobal)) - Minor fixes around plugin\_directories initialization [\#5536](https://github.com/netdata/netdata/pull/5536) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) 2019-03-14T20:11:22+00:00 netdata v1.14.0 netdata v1.14.0 2019-04-26T07:38:02+00:00 Release 1.14 contains 14 bug fixes and 24 improvements. ### At a glance The release introduces major additions to Kubernetes monitoring, with tens of new charts for [Kubelet](https://docs.netdata.cloud/collectors/go.d.plugin/modules/k8s_kubelet/), [kube-proxy](https://docs.netdata.cloud/collectors/go.d.plugin/modules/k8s_kubeproxy/) and [coredns](https://github.com/netdata/go.d.plugin/tree/master/modules/coredns) metrics, as well as significant improvements to the netdata [helm chart](https://github.com/netdata/helmchart/). Two new collectors were added, to monitor [Docker hub](https://docs.netdata.cloud/collectors/go.d.plugin/modules/dockerhub/) and [Docker engine](https://docs.netdata.cloud/collectors/go.d.plugin/modules/docker_engine/) metrics. Finally, v1.14 adds support for [version 2 cgroups](https://github.com/netdata/netdata/pull/5407), [OpenLDAP over TLS](https://github.com/netdata/netdata/pull/5859), [NVIDIA SMI free and per process memory](https://github.com/netdata/netdata/pull/5796/files) and [configurable syslog facilities](https://github.com/netdata/netdata/pull/5792). ### Acknowledgements Our contributors kicked the ball out of the park this time. 
Our thanks go to the following people: @ekartsonakis for the excellent addition of TLS support to the OpenLDAP collector @Wing924 whose cat apparently leaves him enough time to help us with springboot2 and a lot more! @huww98 for his contribution to the NVIDIA SMI plugin. @varyumin for his help on the Kubernetes helm chart. @skrzyp1 for the very significant addition of cgroup v2 support @hsegnitz for his contribution to the web server log plugin. @archisgore for the quick fixes to the Polyverse-enabled docker image. @tctovsli for his Rocket Chat notifications improvements. @JoeWrightss and @vinyasmusic for not letting us get away with spelling mistakes. @andvgal for the addition to the MongoDB collector. @piiiggg for the apache proxy documentation fix @Ferroin for general awesomeness. ### Bug Fixes - Fixed cases where the netdata version produced by the binary or the configure tools of the source code was wrong. Instead of getting something like `netdata-v1.14.0-rc0-39a9sf9g` we would get `netdata-39a9sf9g`. [\#5860](https://github.com/netdata/netdata/pull/5860) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed unexpected crashes of the python plugin on macOS, caused by new security changes made in High Sierra. [\#5838](https://github.com/netdata/netdata/pull/5838) ([ilyam8](https://github.com/ilyam8)) - Fixed a problem with autodetecting failed jobs in the python.d plugin. It now properly restarts jobs that are being rechecked, as soon as they are able to run. [\#5837](https://github.com/netdata/netdata/pull/5837) ([ilyam8](https://github.com/ilyam8)) - CouchDB monitoring would sometimes stop with an exception. Fixed the unhandled exception causing the issue. [\#5833](https://github.com/netdata/netdata/pull/5833) ([ilyam8](https://github.com/ilyam8)) - The netdata api deliberately returned http error 400 when netdata ran in memory mode none.
Modified the behavior to return responses, regardless of the memory mode [\#5819](https://github.com/netdata/netdata/pull/5819) ([cakrit](https://github.com/cakrit)) - The python.d plugin sometimes does not receive `SIGTERM` when netdata exits, resulting in zombie processes. Added a heartbeat so that the process can exit on `SIGPIPE`. [\#5797](https://github.com/netdata/netdata/pull/5797) ([ilyam8](https://github.com/ilyam8)) - The new SMS Server Tools notifications did not handle errors well, resulting in cryptic error messages. Improved error handling. [\#5770](https://github.com/netdata/netdata/pull/5770) ([cakrit](https://github.com/cakrit)) - The installers would crash on some FreeBSD systems, because `sha256sum` used by the installers is not available on all FreeBSD installations. Modified the installers to properly support FreeBSD. [\#5760](https://github.com/netdata/netdata/pull/5760) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Running netdata behind a proxy in FreeBSD did not work, when using UNIX sockets. Added special handling of UNIX sockets for FreeBSD. [\#5756](https://github.com/netdata/netdata/pull/5756) ([vlvkobal](https://github.com/vlvkobal)) - Fixed sporadic build failures of our Docker image, due to dependencies on the Polyverse package ( APK broken state). [\#5751](https://github.com/netdata/netdata/pull/5751) ([archisgore](https://github.com/archisgore)) - Fix segmentation fault in streaming, when two dimensions had similar names. [\#5882](https://github.com/netdata/netdata/pull/5882) ([vlvkobal](https://github.com/vlvkobal)) - Kubernetes Helm Chart: Fixed incorrect use of namespaces in ServiceAccount and ClusterRoleBinding [RBAC fixes](https://github.com/netdata/helmchart/pull/11) ([varyumin](https://github.com/varyumin)). - Elastic search: The option to enable HTTPS was not included in the config file, giving the erroneous impression that HTTPS was not supported. The option was added. 
[\#5834](https://github.com/netdata/netdata/pull/5834) ([ilyam8](https://github.com/ilyam8)) - RocketChat notifications were not being sent properly. Added default recipients for roles in the health alarm notification configuration. [\#5545](https://github.com/netdata/netdata/pull/5545) ([tctovsli](https://github.com/tctovsli)) ### Improvements - go.d.plugin [v0.4.0](https://github.com/netdata/go.d.plugin/releases/tag/v0.4.0) : Docker Hub and k8s coredns collectors, springboot2 URI filters support. - go.d.plugin [v0.3.1](https://github.com/netdata/go.d.plugin/releases/tag/v0.3.1) : Add default job to run k8s_kubelet.conf, k8s_kubeproxy, activemq modules - go.d.plugin [v0.3.0](https://github.com/netdata/go.d.plugin/releases/tag/v0.3.0) : Docker engine, kubelet and kube-proxy collectors. x509check module support for reading certs from a file - Added unified cgroup support that includes v2 cgroups [\#5407](https://github.com/netdata/netdata/pull/5407) ([skrzyp1](https://github.com/skrzyp1)) - Disk stats: Added preferred disk id pattern, so that users can see the id they prefer, when multiple ids appear for the same device [\#5779](https://github.com/netdata/netdata/pull/5779) ([vlvkobal](https://github.com/vlvkobal)) - NVIDIA SMI: Added memory free and per process memory usage charts to the collector [\#5796](https://github.com/netdata/netdata/pull/5796) ([huww98](https://github.com/huww98)) - OpenLDAP: Added TLS support, to allow monitoring of LDAPS. [\#5859](https://github.com/netdata/netdata/pull/5859) ([ekartsonakis](https://github.com/ekartsonakis)) - PHP-FPM: Add health check to raise alarms when the PHP-FPM server is unreachable [\#5836](https://github.com/netdata/netdata/pull/5836) ([ilyam8](https://github.com/ilyam8)) - PostgreSQL: Our configuration options to connect to a DB did not support all possible options. Added option to connect to a PostgreSQL instance by defining a connection string (URI).
[\#5758](https://github.com/netdata/netdata/pull/5758) ([ilyam8](https://github.com/ilyam8)) - python.d.plugin: There was no way to delete obsolete dimensions in charts created by the python.d plugin. The plugin can now delete dimensions at runtime. [\#5795](https://github.com/netdata/netdata/pull/5795) ([ilyam8](https://github.com/ilyam8)) - netdata supports sending its logs to Syslog, but the facility was hard-coded. We now support configurable Syslog facilities in `netdata.conf`. [\#5792](https://github.com/netdata/netdata/pull/5792) ([thiagoftsm](https://github.com/thiagoftsm)) - We encountered sporadic failures of our kickstart installation scripts after nightly releases. We added integrity tests to our pipeline to prevent faulty scripts from getting deployed. [\#5778](https://github.com/netdata/netdata/pull/5778) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - [Kubernetes Helm Chart](https://github.com/netdata/helmchart/) improvements: ([cakrit](https://github.com/cakrit)) and ([varyumin](https://github.com/varyumin)). - Added serviceName in statefulset spec to align with the k8s documentation - Added preStart command to persist slave machine GUIDs, so that pod deletion/addition during upgrades doesn't lose the slave history. - Disabled non-essential master netdata collector plugins to avoid duplicate data - Added preStop command to wait for netdata to exit gracefully before removing the container - Extended configuration file support to provide more control from the helm command line - Added option to disable Role-based access control - Added liveness and readiness probes. 2019-04-26T07:38:02+00:00 netdata v1.15.0 netdata v1.15.0 2019-05-21T08:15:39+00:00 Release v1.15.0 contains 11 bug fixes and 30 improvements. ### At a glance We are very happy and proud to be able to include two major improvements in this release: The aggregated node view and the [new database engine](https://docs.netdata.cloud/database/engine/).
#### Aggregated node view The No. 1 request from our community has been a better way to view and manage their Netdata installations, via an aggregated view. The node menu with the simple list of hosts on the agent UI just didn't do it for people with hundreds, or thousands of instances. This release introduces the node view, which uses the power of [Netdata Cloud](https://blog.netdata.cloud/posts/netdata-cloud-announcement/) to deliver powerful views of a Netdata-based monitoring infrastructure. ![Screenshot from 2019-05-17 19-57-58](https://user-images.githubusercontent.com/43294513/57947790-27f6bd80-78e0-11e9-9d23-ab969672df1f.png) You can read more about Netdata Cloud and the future of netdata [here](https://blog.netdata.cloud/posts/netdata-cloud-announcement/). #### New database engine Historically, Netdata has required a lot of memory for long-term metrics storage. To mitigate this we've been building a new DB engine for several months and will continue improving until it can become the default `memory mode` for new Netdata installations. The version included in release v1.15.0 already permits longer-term storage of compressed data and we'll continue reducing the required memory in following releases. #### Other major additions We have added support for the [AWS Kinesis backend](https://docs.netdata.cloud/backends/aws_kinesis/) and new collectors for [OpenVPN](https://docs.netdata.cloud/collectors/go.d.plugin/modules/openvpn/), the [Tengine web server](https://docs.netdata.cloud/collectors/go.d.plugin/modules/tengine/), [ScaleIO (VxFlex OS)](https://docs.netdata.cloud/collectors/go.d.plugin/modules/scaleio/), [ioping-like latency metrics](https://docs.netdata.cloud/collectors/ioping.plugin/) and [Energi Core node instances](https://docs.netdata.cloud/collectors/python.d.plugin/energid/). 
We now have a new, ["text-only" chart type](https://github.com/netdata/netdata/issues/5578), [cpu limits for v2 cgroups](https://github.com/netdata/netdata/issues/5850), [docker swarm metrics](https://docs.netdata.cloud/collectors/go.d.plugin/modules/docker_engine/) and improved [documentation](https://docs.netdata.cloud/). We continued improving the [Kubernetes helmchart](https://github.com/netdata/helmchart) with liveness probes for slaves, persistence options, a fix for a `Cannot allocate memory` issue and easy configuration for the kubelet, kube-proxy and coredns collectors. Finally, we built a process to quickly replace any problematic nightly builds and added more automated CI tests to prevent such builds from being published in the first place. ### Acknowledgements Our heartfelt gratitude for this release goes to the following people: - @kam1kaze for help with Kubernetes, a fix for the Docker image and documentation improvements. - @andvgal for the Energi Core daemon collector and the improvement of the python.d plugin. - @skrzyp1 for improving cgroup monitoring. - @Daniel15 for the much sought-after "text-only" new chart type. - @Fohdeesha, @SahAssar, and @smonff for improving the documentation. - @etienne-napoleone, @karuppiah7890 and @varyumin for their contributions to the Kubernetes helm chart. 
### Improvements - Support for aggregate node view [\#5902](https://github.com/netdata/netdata/pull/5902) ([gmosx](https://github.com/gmosx)) - Database engine [\#5282](https://github.com/netdata/netdata/pull/5282) ([mfundul](https://github.com/mfundul)) - New collector modules: - Go.d collectors for [OpenVPN](https://github.com/netdata/go.d.plugin/tree/master/modules/openvpn), the [Tengine web server](https://github.com/netdata/go.d.plugin/tree/master/modules/tengine) and [ScaleIO (VxFlex OS) instances](https://github.com/netdata/go.d.plugin/tree/master/modules/scaleio) ([ilyam8](https://github.com/ilyam8)) - Monitor disk access latency like ioping does [\#5725](https://github.com/netdata/netdata/pull/5725) ([vlvkobal](https://github.com/vlvkobal)) - Energi Core daemon monitoring, suits other Bitcoin forks [\#5894](https://github.com/netdata/netdata/pull/5894) ([andvgal](https://github.com/andvgal)) - Collector improvements: - Add docker swarm manager metrics to the go.d [docker_engine collector](https://github.com/netdata/go.d.plugin/tree/master/modules/docker_engine) ([ilyam8](https://github.com/ilyam8)) - Implement unified cgroup cpu limit [\#5895](https://github.com/netdata/netdata/pull/5895) ([skrzyp1](https://github.com/skrzyp1)) - python.d.plugin: Allow monitoring of HTTP(S) endpoints which require POST data and make the UrlService more flexible [\#5893](https://github.com/netdata/netdata/pull/5893) ([andvgal](https://github.com/andvgal)) - Support the AWS Kinesis backend for long-term storage [\#5914](https://github.com/netdata/netdata/pull/5914) ([vlvkobal](https://github.com/vlvkobal)) - Add a new "text-only" chart renderer [\#5971](https://github.com/netdata/netdata/pull/5971) ([Daniel15](https://github.com/Daniel15)) - Packaging and CI improvements: - We can now fix more quickly any problematic published builds via a new manual deployment procedure [\#5899](https://github.com/netdata/netdata/pull/5899) 
([paulkatsoulakis](https://github.com/paulkatsoulakis)) - We added more tests to our nightly builds, to catch more errors before publishing images [\#5918](https://github.com/netdata/netdata/pull/5918) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - API Improvements: - Smarter caching of API calls. Do not cache `alarms` and `info` api calls and extend no-cache headers. [\#5999](https://github.com/netdata/netdata/pull/5999) ([cakrit](https://github.com/cakrit)) - Extend the `api/v1/info` call response with system and collector information [\#5889](https://github.com/netdata/netdata/pull/5889) & [\#5891](https://github.com/netdata/netdata/pull/5891) ([cakrit](https://github.com/cakrit)), [\#5996](https://github.com/netdata/netdata/pull/5996) ([vlvkobal](https://github.com/vlvkobal)) - k6 script for API load testing [\#5892](https://github.com/netdata/netdata/pull/5892) ([cakrit](https://github.com/cakrit)) - Kubernetes helmchart improvements: - Added the init container, where sysctl params could be managed, to bypass the `Cannot allocate memory` issue [#18](https://github.com/netdata/helmchart/pull/18) ([kam1kaze](https://github.com/kam1kaze)) - Better startup/shutdown of slaves and reduced memory usage with liveness/readiness probes and default memory mode none [#19](https://github.com/netdata/helmchart/pull/19) ([cakrit](https://github.com/cakrit)) - Added the option of overriding the default settings for kubelet, kubeproxy and coredns collectors via values.yaml [#24](https://github.com/netdata/helmchart/pull/24) ([cakrit](https://github.com/cakrit)) - Make the use of persistent volumes optional, add `apiVersion` to fix linting errors and correct the location of the `env` field [#22](https://github.com/netdata/helmchart/pull/22), [#23](https://github.com/netdata/helmchart/pull/23) ([karuppiah7890](https://github.com/karuppiah7890)) - Fix incorrect parameter names in the README [#24](https://github.com/netdata/helmchart/pull/24) 
([etienne-napoleone](https://github.com/etienne-napoleone)) - Documentation improvements: - [\#5936](https://github.com/netdata/netdata/pull/5936) ([smonff](https://github.com/smonff)) - [\#5974](https://github.com/netdata/netdata/pull/5974) ([SahAssar](https://github.com/SahAssar)) - [\#5992](https://github.com/netdata/netdata/pull/5992), [\#6024](https://github.com/netdata/netdata/pull/6024) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - [\#6016](https://github.com/netdata/netdata/pull/6016), [\#6029](https://github.com/netdata/netdata/pull/6029), [\#6030](https://github.com/netdata/netdata/pull/6030), [\#6032](https://github.com/netdata/netdata/pull/6032) ([cakrit](https://github.com/cakrit)) - [\#5982](https://github.com/netdata/netdata/pull/5982) ([Fohdeesha](https://github.com/Fohdeesha)) - [\#5980](https://github.com/netdata/netdata/pull/5980) ([kam1kaze](https://github.com/kam1kaze)) ### Bug fixes - Prowl notifications were not being sent, unless another notification method was also active [\#6022](https://github.com/netdata/netdata/pull/6022) ([cakrit](https://github.com/cakrit)) - Fix exception handling in the python.d plugin [\#5997](https://github.com/netdata/netdata/pull/5997) ([ilyam8](https://github.com/ilyam8)) - The `node` applications group did not include all node processes. [\#5962](https://github.com/netdata/netdata/pull/5962) ([jonfairbanks](https://github.com/jonfairbanks)) - Installation would show incorrect message "FAILED Cannot install netdata init service." 
in some cases [\#5947](https://github.com/netdata/netdata/pull/5947) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - The nvidia\_smi collector displayed incorrect power usage [\#5940](https://github.com/netdata/netdata/pull/5940) ([ilyam8](https://github.com/ilyam8)) - The python.d plugin would sometimes hang, because it lacked a connect timeout [\#5911](https://github.com/netdata/netdata/pull/5911) ([ilyam8](https://github.com/ilyam8)) - The mongodb collector raised errors due to various KeyErrors [\#5931](https://github.com/netdata/netdata/pull/5931) ([ilyam8](https://github.com/ilyam8)) - The smartd\_log collector would show incorrect temperature values [\#5923](https://github.com/netdata/netdata/pull/5923) ([ilyam8](https://github.com/ilyam8)) - charts.d plugins would fail on docker, when using the `timeout` command [\#5938](https://github.com/netdata/netdata/pull/5938) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Docker image had plugins not executable by user netdata [\#5917](https://github.com/netdata/netdata/pull/5917) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Docker image was missing the `lsns` command, used to match network interfaces to containers [#1](https://github.com/netdata/helper-images/pull/1) ([kam1kaze](https://github.com/kam1kaze)) 2019-05-21T08:15:39+00:00 netdata v1.16.0 netdata v1.16.0 2019-07-08T19:15:29+00:00 Release v1.16.0 contains 40 bug fixes, 31 improvements and 20 documentation updates ### At a glance **Binary distributions.** To improve the security, speed and reliability of new netdata installations, we are delivering our own, industry standard installation method, with binary package distributions. The RPM binaries for the most common OSs are already available on packagecloud and we’ll have the DEB ones available very soon. All distributions are considered in Beta and, as always, we depend on our amazing community for feedback on improvements. 
- Our stable distributions are at [netdata/netdata @ packagecloud.io](https://packagecloud.io/netdata/netdata) - The nightly builds are at [netdata/netdata-edge @ packagecloud.io](https://packagecloud.io/netdata/netdata-edge) **Netdata now supports SSL encryption!** You can secure the communication to the [web server](https://docs.netdata.cloud/web/server/#enabling-tls-support), the [streaming connections from slaves to the master](https://docs.netdata.cloud/streaming/#securing-the-communication) and the connection to an [OpenTSDB backend](https://docs.netdata.cloud/backends/opentsdb/#https). **This version also brings two long-awaited features to netdata’s health monitoring:** - The [health management API](https://docs.netdata.cloud/web/api/health/#health-management-api) introduced in v1.12 allowed you to easily disable alarms and/or notifications while netdata was running. However, those changes were not persisted across netdata restarts. Since part of routine maintenance activities may involve completely restarting a monitoring node, netdata now saves these configurations to disk, every time you issue a command to change the silencer settings. The new [LIST command](https://docs.netdata.cloud/web/api/health/#list-silencers) of the API allows you to view at any time which alarms are currently disabled or silenced. - A way for netdata to [repeatedly send alarm notifications](https://docs.netdata.cloud/health/#alarm-line-repeat) for some or all active alarms, at a frequency of your choosing. As a result, you will no longer have to worry about missing a notification or forgetting about a raised alarm. The default is still to only send a single notification, so that existing users are not surprised by a different behavior. As always, we’ve introduced new collectors, 5 of them this time.
- Of special interest to people with Windows servers in their infrastructure is the [WMI collector](https://docs.netdata.cloud/collectors/go.d.plugin/modules/wmi/), though we are fully aware that we need to continue our efforts to do a proper port to Windows. - The new `perf` plugin collects system-wide CPU performance statistics from Performance Monitoring Units (PMU) using the `perf_event_open()` system call. You can read a wonderful article on why this is useful [here](http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html). - The other three are collectors to monitor [Dnsmasq DHCP leases](https://docs.netdata.cloud/collectors/go.d.plugin/modules/dnsmasq_dhcp/), [Riak KV servers](https://docs.netdata.cloud/collectors/python.d.plugin/riakkv/) and [Pihole instances](https://docs.netdata.cloud/collectors/go.d.plugin/modules/pihole/). Finally, the DB Engine introduced in v1.15.0 now uses much less memory and is more robust than before. ### Acknowledgements As you’ll see in the detailed list below, once again we’ve had great help from our contributors. 
- [Steve8291](https://github.com/Steve8291) was helping everywhere - [apardyl](https://github.com/apardyl) added useful new alarms and helped with documentation - [jchristgit](https://github.com/jchristgit) wrote the Riak KV collector - [Saruspete](https://github.com/Saruspete) made improvements to the freeipmi plugin - [kam1kaze](https://github.com/kam1kaze) has added new charts to the python mysql collector - [akwan](https://github.com/akwan) and [mbarper](https://github.com/mbarper) improved the application monitoring, with new process groupings - [nodiscc](https://github.com/nodiscc) helped with bug and documentation fixes - [dankohn](https://github.com/dankohn) helped with the documentation - [andvgal](https://github.com/andvgal) added an amazing configuration to help us run proper lint checks on our markdown files - [octomike](https://github.com/octomike), [Danamir](https://github.com/Danamir), [mbarper](https://github.com/mbarper), [Wing924](https://github.com/Wing924), [n0coast](https://github.com/n0coast) and [toofar](https://github.com/toofar) delivered bug fixes - [josecv](https://github.com/josecv) helped improve the Kubernetes helm chart. We can't stress enough the immense help we get just from users creating an issue in GitHub, helping us identify the root cause and validate the change in their infrastructure. Unfortunately, we are not able to list all of them here, but their contribution is invaluable.
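As a rough illustration of the health management API's LIST command mentioned under "At a glance" for this release, the sketch below builds such a request without sending it. The endpoint path `/api/v1/manage/health`, the `cmd` query parameter and the `X-Auth-Token` header are assumptions taken from the health management API documentation linked in these notes and should be checked against the docs for your version. The host, port and token are placeholders.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def health_command_request(base_url: str, token: str, cmd: str) -> Request:
    """Build (but do not send) a request for the health management API."""
    # quote_via=quote encodes spaces as %20, e.g. for commands such as "SILENCE ALL".
    query = urlencode({"cmd": cmd}, quote_via=quote)
    # Assumed endpoint path; verify against the linked API documentation.
    url = f"{base_url.rstrip('/')}/api/v1/manage/health?{query}"
    # Assumed authorization header name; the token value here is a placeholder.
    return Request(url, headers={"X-Auth-Token": token})


if __name__ == "__main__":
    req = health_command_request("http://localhost:19999", "PLACEHOLDER-TOKEN", "LIST")
    print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. It is left out here so the sketch runs without a live agent.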
### Improvements #### Binary packages - Introduced automatic binary packages generation and delivery for RPM types \(Phase 1\) [\#6223](https://github.com/netdata/netdata/pull/6223) [\#6369](https://github.com/netdata/netdata/pull/6369) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) #### Health - Easily disable alarms, by persisting the silencers configuration [\#6274](https://github.com/netdata/netdata/pull/6274) [\#6360](https://github.com/netdata/netdata/pull/6360) ([thiagoftsm](https://github.com/thiagoftsm)) - Repeating alarm notifications [\#6309](https://github.com/netdata/netdata/pull/6309) ([thiagoftsm](https://github.com/thiagoftsm)) and ([kamcpp](https://github.com/kamcpp)) - Simplified the health cmdapi tester - no setup/cleanup needed [\#6210](https://github.com/netdata/netdata/pull/6210) ([cakrit](https://github.com/cakrit)) - Add last\_collected alarm to the x509check collector [\#6139](https://github.com/netdata/netdata/pull/6139) ([ilyam8](https://github.com/ilyam8)) - New alarm for abnormally high number of active processes.
[\#6116](https://github.com/netdata/netdata/pull/6116) ([apardyl](https://github.com/apardyl)) #### Security - SSL support in the web server and streaming/replication [\#5956](https://github.com/netdata/netdata/pull/5956) ([thiagoftsm](https://github.com/thiagoftsm)) - Support encrypted connections to OpenTSDB backends [\#6220](https://github.com/netdata/netdata/pull/6220) ([thiagoftsm](https://github.com/thiagoftsm)) - Show the security policy directly from GitHub [\#6163](https://github.com/netdata/netdata/pull/6163) [\#6166](https://github.com/netdata/netdata/pull/6166) ([cakrit](https://github.com/cakrit)) #### New collectors - Go.d collector modules for [WMI](https://github.com/netdata/go.d.plugin/tree/master/modules/wmi), [Dnsmasq DHCP leases](https://github.com/netdata/go.d.plugin/tree/master/modules/dnsmasq_dhcp) and [Pihole](https://github.com/netdata/go.d.plugin/tree/master/modules/pihole) ([ilyam8](https://github.com/ilyam8)) - Riak KV instances collector [\#6286](https://github.com/netdata/netdata/pull/6286) ([jchristgit](https://github.com/jchristgit)) - CPU performance statistics using Performance Monitoring Units (PMU) via the `perf_event_open()` system call.
(perf plugin) [\#6225](https://github.com/netdata/netdata/pull/6225) ([vlvkobal](https://github.com/vlvkobal)) #### Collector improvements - Handle different sensor IDs for the same element in the freeipmi plugin [\#6296](https://github.com/netdata/netdata/pull/6296) ([Saruspete](https://github.com/Saruspete)) - Increase the cpu\_limit chart precision in cgroup plugin [\#6172](https://github.com/netdata/netdata/pull/6172) ([vlvkobal](https://github.com/vlvkobal)) - Added `userstats` and `deadlocks` charts to the python mysql collector [\#6118](https://github.com/netdata/netdata/pull/6118) [\#6115](https://github.com/netdata/netdata/pull/6115) ([kam1kaze](https://github.com/kam1kaze)) - Add perforce server process monitoring to the apps plugin [\#6064](https://github.com/netdata/netdata/pull/6064) ([akwan](https://github.com/akwan)) #### Backends - Prometheus remote write backend [\#6062](https://github.com/netdata/netdata/pull/6062) ([vlvkobal](https://github.com/vlvkobal)) #### DB engine improvements - Reduced memory requirements by 40-50% [\#6134](https://github.com/netdata/netdata/pull/6134) ([mfundul](https://github.com/mfundul)) - Reduced the number of pages needed to be stored and indexed when using `memory mode = dbengine`, by adding empty page detection [\#6173](https://github.com/netdata/netdata/pull/6173) ([mfundul](https://github.com/mfundul)) #### Rebranding - Updated the netdata logo and changed links to point to the new website [\#6359](https://github.com/netdata/netdata/pull/6359) [\#6398](https://github.com/netdata/netdata/pull/6398) ([cakrit](https://github.com/cakrit)), [\#6396](https://github.com/netdata/netdata/pull/6396) ([ivorjvr](https://github.com/ivorjvr)), [\#6389](https://github.com/netdata/netdata/pull/6389) ([joelhans](https://github.com/joelhans)) #### Documentation - Improve documentation about file descriptors and systemd configuration. 
[\#6372](https://github.com/netdata/netdata/pull/6372) ([mfundul](https://github.com/mfundul)) - Update the documentation on charts with zero metrics [\#6314](https://github.com/netdata/netdata/pull/6314) ([vlvkobal](https://github.com/vlvkobal)) - Document that in versions before 1.16, the plugins.d directory may be installed in a different location in certain OSs [\#6301](https://github.com/netdata/netdata/pull/6301) ([cakrit](https://github.com/cakrit)) - Remove single and multi-threaded web server configuration instructions [\#6291](https://github.com/netdata/netdata/pull/6291) ([nodiscc](https://github.com/nodiscc)) - Add more info on the `stream.conf` option `health enabled by default = auto` [\#6281](https://github.com/netdata/netdata/pull/6281) ([cakrit](https://github.com/cakrit)) - Add comments about AWS SDK for C++ installation [\#6277](https://github.com/netdata/netdata/pull/6277) ([vlvkobal](https://github.com/vlvkobal)) - Fix on the installation readme regarding the supported systems (first came RedHat, then the others) [\#6271](https://github.com/netdata/netdata/pull/6271) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Update the new dbengine documentation [\#6264](https://github.com/netdata/netdata/pull/6264) ([mfundul](https://github.com/mfundul)) - Remove CNCF logo and TOC presentation reference [\#6234](https://github.com/netdata/netdata/pull/6234) ([dankohn](https://github.com/dankohn)) - Added code style guidance to CONTRIBUTING [\#6212](https://github.com/netdata/netdata/pull/6212) ([cakrit](https://github.com/cakrit)) - Visibility fix for anonymous statistics [\#6208](https://github.com/netdata/netdata/pull/6208) ([cakrit](https://github.com/cakrit)) - smartd documentation improvements [\#6207](https://github.com/netdata/netdata/pull/6207) ([cakrit](https://github.com/cakrit)), [\#6203](https://github.com/netdata/netdata/pull/6203) ([Steve8291](https://github.com/Steve8291)) - Made custom notification's instructions clearer 
[\#6181](https://github.com/netdata/netdata/pull/6181) ([cakrit](https://github.com/cakrit)) - Fix typo in the web server README [\#6146](https://github.com/netdata/netdata/pull/6146) ([cakrit](https://github.com/cakrit)) - Registry documentation fixes [\#6144](https://github.com/netdata/netdata/pull/6144) ([cakrit](https://github.com/cakrit)) - Changed 'netdata' to 'Netdata' in /docs/ and /README.md [\#6137](https://github.com/netdata/netdata/pull/6137) ([apardyl](https://github.com/apardyl)) - Update installer readme with OpenSUSE dependencies [\#6111](https://github.com/netdata/netdata/pull/6111) ([mfundul](https://github.com/mfundul)) - Fixed minor typos in the daemon configuration documentation [\#6090](https://github.com/netdata/netdata/pull/6090) ([Steve8291](https://github.com/Steve8291)) - Mention anonymous statistics in additional places in the docs [\#6084](https://github.com/netdata/netdata/pull/6084) ([cakrit](https://github.com/cakrit)) - Local remark-lint checks and autofix support [\#5898](https://github.com/netdata/netdata/pull/5898) ([andvgal](https://github.com/andvgal)) #### Other - Pass the `cloud base url` parameter to the notifications mechanism, so that modifications to the configuration are respected when creating the link to the alarm [\#6383](https://github.com/netdata/netdata/pull/6383) ([ladakis](https://github.com/ladakis)) - Added a `.gitattributes` file to improve `git diff` for C files [\#6381](https://github.com/netdata/netdata/pull/6381) ([ac000](https://github.com/ac000)) - Improved logging, to be able to trace the `CRITICAL: main[main] SIGPIPE received.` error [\#6373](https://github.com/netdata/netdata/pull/6373) ([vlvkobal](https://github.com/vlvkobal)) - Modify the limits of the stale bot, to close stale questions/discussions in GitHub faster [\#6297](https://github.com/netdata/netdata/pull/6297) ([ilyam8](https://github.com/ilyam8)) - Internal CI/CD improvements [\#6282](https://github.com/netdata/netdata/pull/6282) 
[\#6268](https://github.com/netdata/netdata/pull/6268) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - netdata/packaging: Add more distribution validations [\#6235](https://github.com/netdata/netdata/pull/6235) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Move call to send\_statistics later, to get more telemetry events from docker containers [\#6113](https://github.com/netdata/netdata/pull/6113) ([vlvkobal](https://github.com/vlvkobal)), [\#6096](https://github.com/netdata/netdata/pull/6096) ([cakrit](https://github.com/cakrit)) - Use github templating mechanisms to classify issues when they are created [\#5776](https://github.com/netdata/netdata/pull/5776) ([paulfantom](https://github.com/paulfantom)) ### Bug fixes - Fixed `ram_available` alarm [\#6261](https://github.com/netdata/netdata/pull/6261) ([octomike](https://github.com/octomike)) - Stop monitoring `/dev` and `/run` in the disk space and inode usage charts [\#6399](https://github.com/netdata/netdata/pull/6399) ([vlvkobal](https://github.com/vlvkobal)) - Fixed the monitoring of the “time” group of processes [\#6397](https://github.com/netdata/netdata/pull/6397) ([mbarper](https://github.com/mbarper)) - Fixed compilation error `PERF_COUNT_HW_REF_CPU_CYCLES' undeclared here` in old Linux kernels (perf plugin) [\#6382](https://github.com/netdata/netdata/pull/6382) ([vlvkobal](https://github.com/vlvkobal)) - Fixed autodetection for openldap on Debian (apps.plugin) [\#6364](https://github.com/netdata/netdata/pull/6364) ([nodiscc](https://github.com/nodiscc)) - Fixed compilation error on CentOS 6 (nfacct plugin) [\#6351](https://github.com/netdata/netdata/pull/6351) ([vlvkobal](https://github.com/vlvkobal)) - Fixed invalid XML page error (tomcat plugin) [\#6345](https://github.com/netdata/netdata/pull/6345) ([Danamir](https://github.com/Danamir)) - Remove obsolete monit metrics [\#6340](https://github.com/netdata/netdata/pull/6340) ([ilyam8](https://github.com/ilyam8)) - Fixed `Failed to 
parse` error in adaptec\_raid [\#6338](https://github.com/netdata/netdata/pull/6338) ([ilyam8](https://github.com/ilyam8)) - Fixed `cluster_health_nodes` and `cluster_stats_nodes` charts in the elasticsearch collector [\#6311](https://github.com/netdata/netdata/pull/6311) ([Wing924](https://github.com/Wing924)) - A modified slave chart's "name" was not properly transferred to the master (streaming) [\#6304](https://github.com/netdata/netdata/pull/6304) ([vlvkobal](https://github.com/vlvkobal)) - Netdata could run out of file descriptors when using the new DB engine [\#6303](https://github.com/netdata/netdata/pull/6303) ([mfundul](https://github.com/mfundul)) - Fixed UI behavior when pressing the `End` key [\#6294](https://github.com/netdata/netdata/pull/6294) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed UI link to check the configuration file, to open in a new tab [\#6294](https://github.com/netdata/netdata/pull/6294) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed files not found during installation, due to different than expected location of the `libexecdir` directory [\#6272](https://github.com/netdata/netdata/pull/6272) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Prevented `Error: 'module' object has no attribute 'Retry'` messages from python collectors, by enforcing minimum version check for the `UrlService` library [\#6263](https://github.com/netdata/netdata/pull/6263) ([ilyam8](https://github.com/ilyam8)) - Fixed typo that causes nfacct.plugin log messages to incorrectly show `freeipmi` [\#6260](https://github.com/netdata/netdata/pull/6260) ([vlvkobal](https://github.com/vlvkobal)) - Fixed netdata/netdata docker image failure, when users pass a PGID that already exists on the system [\#6259](https://github.com/netdata/netdata/pull/6259) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - The daemon could get stuck during collection or during shutdown, when using the new dbengine. 
Reduced new dbengine IO utilization by forcing page alignment per dimension of chart. [\#6240](https://github.com/netdata/netdata/pull/6240) ([mfundul](https://github.com/mfundul)) - Properly handle timeouts/no response in dns\_query\_time python collector [\#6237](https://github.com/netdata/netdata/pull/6237) ([n0coast](https://github.com/n0coast)) - When a collector restarted after having stopped for a long time, the new dbengine would consume a lot of CPU resources. [\#6216](https://github.com/netdata/netdata/pull/6216) ([mfundul](https://github.com/mfundul)) - Fixed error `Assertion `old_state & PG_CACHE_DESCR_ALLOCATED' failed` of the new dbengine. Eliminated a page cache descriptor race condition [\#6202](https://github.com/netdata/netdata/pull/6202) ([mfundul](https://github.com/mfundul)) - tv.html failed to load the three left charts when accessed via https. Turn tv.html links to https [\#6198](https://github.com/netdata/netdata/pull/6198) ([cakrit](https://github.com/cakrit)) - Change print level from error to info for messages about clearing old files from the database [\#6195](https://github.com/netdata/netdata/pull/6195) ([mfundul](https://github.com/mfundul)) - Fixed warning regarding the x509check\_last\_collected\_secs alarms. Changed the template update frequency to 60s, to match the chart’s update frequency [\#6194](https://github.com/netdata/netdata/pull/6194) ([ilyam8](https://github.com/ilyam8)) - Email notification header lines were not terminated with `\r\n` as per the RFC [\#6187](https://github.com/netdata/netdata/pull/6187) ([toofar](https://github.com/toofar)) - Some log entries would not be caught by the python web_log plugin. 
Fixed the regular expressions [\#6138](https://github.com/netdata/netdata/pull/6138) [\#6180](https://github.com/netdata/netdata/pull/6180) ([ilyam8](https://github.com/ilyam8)) - Corrected the date used in pushbullet notifications [\#6179](https://github.com/netdata/netdata/pull/6179) ([cakrit](https://github.com/cakrit)) - Fixed FATAL error when using the new dbengine with no direct I/O support, by falling back to buffered I/O [\#6174](https://github.com/netdata/netdata/pull/6174) ([mfundul](https://github.com/mfundul)) - Fixed compatibility issues with varnish v4 (varnish collector) [\#6168](https://github.com/netdata/netdata/pull/6168) ([ilyam8](https://github.com/ilyam8)) - The total number of disks in mdstat.XX_disks chart was displayed incorrectly. Fixed the "inuse" and "down" disks stacking. [\#6164](https://github.com/netdata/netdata/pull/6164) ([vlvkobal](https://github.com/vlvkobal)) - The config option --disable-telemetry was being checked after restarting netdata, which means that we would still send anonymous statistics the first time netdata was started. [\#6127](https://github.com/netdata/netdata/pull/6127) ([cakrit](https://github.com/cakrit)) - Fixed apcupsd collector errors, by passing correct info to the run function. [\#6126](https://github.com/netdata/netdata/pull/6126) ([Steve8291](https://github.com/Steve8291)) - apcupsd and libreswan were not enabled by default [\#6120](https://github.com/netdata/netdata/pull/6120) ([Steve8291](https://github.com/Steve8291)) - Fixed incorrect module name: energi to energid [\#6112](https://github.com/netdata/netdata/pull/6112) ([Steve8291](https://github.com/Steve8291)) - The nodes view did not work properly when a reverse proxy was configured to access netdata via paths containing subpaths (e.g. 
myserver/netdata) [\#6093](https://github.com/netdata/netdata/pull/6093) ([gmosx](https://github.com/gmosx)) - Fix error message `PLUGINSD : cannot open plugins directory` [\#6080](https://github.com/netdata/netdata/pull/6080) [\#6089](https://github.com/netdata/netdata/pull/6089) ([Steve8291](https://github.com/Steve8291)) - Corrected invalid links to web\_log.conf that appear on the agent UI [\#6087](https://github.com/netdata/netdata/pull/6087) ([cakrit](https://github.com/cakrit)) - Fixed ScaleIO collector endpoint paths [go.d PR 226](https://github.com/netdata/go.d.plugin/pull/226) ([ilyam8](https://github.com/ilyam8)) - Fixed web client timeout handling in the go.d plugin httpcheck collector [go.d PR 225](https://github.com/netdata/go.d.plugin/pull/225) ([ilyam8](https://github.com/ilyam8)) 2019-07-08T19:15:29+00:00 netdata v1.17.0 netdata v1.17.0 2019-09-03T09:57:55+00:00 Release v1.17.0 contains 38 bug fixes, 33 improvements, and 20 documentation updates. ## At a glance You can now change the data collection frequency at will, without losing previously collected values. A major improvement to the new database engine allows you not only to store metrics at variable granularity, but also to autoscale the time axis of the charts, depending on the data collection frequencies used during the presented time. You can also now monitor VM performance from one or more vCenter servers with a new [VSphere collector](https://docs.netdata.cloud/collectors/go.d.plugin/modules/vsphere/). In addition, the `proc` plugin now also collects ZRAM device performance metrics and the `apps` plugin monitors process uptime for the defined process groups. Continuing our efforts to integrate with as many existing solutions as possible, you can now directly archive metrics from Netdata to MongoDB via a new backend. Netdata badges now support international (UTF8) characters! 
We also made our URL parser smarter, not only for international character support, but also for other strange API queries. We also added `.DEB` packages to our binary distribution repositories at [Packagecloud](https://packagecloud.io/netdata), a new collector for Linux zram device metrics, and support for plain text email notifications. This release includes several fixes and improvements to the TLS encryption feature we introduced in v1.16.0. First, encryption of slave-to-master streaming connections wasn't working as intended. And second, our community helped us discover cases where HTTP requests were not correctly redirected to HTTPS with TLS enabled. This release mitigates those issues and improves TLS support overall. Finally, we improved the way Netdata displays charts with no metrics. By default, Netdata displays charts for disks, memory, and networks only when the associated metrics are not zero. Users could enable these charts permanently using the corresponding configuration options, but they would need to change more than 200 options. With this new improvement, users can enable all charts with zero values using a single, global configuration parameter. ## Acknowledgements Our thanks go to: - [Steve8291](https://github.com/Steve8291) for all his help across the board! 
- [alpes214](https://github.com/alpes214) for improvements in health monitoring - [fun04wr0ng](https://github.com/fun04wr0ng) for fixing a bug in the `nfacct` plugin - [RaZeR-RBI](https://github.com/RaZeR-RBI) for the ZRAM collector module - [underhood](https://github.com/underhood) for the UTF-8 parsing fixes in badges, that gave us support for internationalized badges - [Ferroin](https://github.com/Ferroin) for improving the python.d collectors' handling of disconnected sockets - [dex4er](https://github.com/dex4er) for improving our OS detection code - [knatsakis](https://github.com/knatsakis) for his help in our CI/CD pipeline - [sunflowerbofh](https://github.com/sunflowerbofh) for `.gitignore` fixes - [Cat7373](https://github.com/Cat7373) for fixing some issues with the `spigotmc` collector ## Improvements ### Database engine - Variable granularity support for data collection [#6430](https://github.com/netdata/netdata/pull/6430) ([mfundul](https://github.com/mfundul)) - Added tips on the UI to encourage users to try the new DB Engine, when they reach the end of their metrics history [#6711](https://github.com/netdata/netdata/pull/6711) ([jacekkolasa](https://github.com/jacekkolasa)) ### Binary packages - Added nightly generation of RPM/DEB amd64 packages [#6675](https://github.com/netdata/netdata/pull/6675) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Provided built-in support for the prometheus remote write API in our packages [#6480](https://github.com/netdata/netdata/pull/6480) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Documented distribution support matrix and functionality availability [#6552](https://github.com/netdata/netdata/pull/6552) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) ### Health - Added support for plain text only email notifications [#6485](https://github.com/netdata/netdata/pull/6485) ([leo-lb](https://github.com/leo-lb)) - Started showing “hidden” alarm variables in the responses of the `chart` and 
`data` API calls (#6054) [#6615](https://github.com/netdata/netdata/pull/6615) ([alpes214](https://github.com/alpes214)) - Added a new API call for alarm status counters, as a first step towards badges that will show the total number of alarms [#6554](https://github.com/netdata/netdata/pull/6554) ([alpes214](https://github.com/alpes214)) ### Security - Added configurable default locations for trusted CA certificates [#6549](https://github.com/netdata/netdata/pull/6549) ([thiagoftsm](https://github.com/thiagoftsm)) - Added safer way to get container names [#6441](https://github.com/netdata/netdata/pull/6441) ([ViViDboarder](https://github.com/ViViDboarder)) - Added SSL connection support to the python mongodb collector [#6546](https://github.com/netdata/netdata/pull/6546) ([ilyam8](https://github.com/ilyam8)) ### New collectors - VSphere collector [go.d.plugin PR241](https://github.com/netdata/go.d.plugin/pull/241) [#6572](https://github.com/netdata/netdata/pull/6572) ([ilyam8](https://github.com/ilyam8)) ### Collector improvements - rethinkdb collector new driver support [#6431](https://github.com/netdata/netdata/pull/6431) ([ilyam8](https://github.com/ilyam8)) - The apps plugin now displays process uptime charts [#6654](https://github.com/netdata/netdata/pull/6654) ([vlvkobal](https://github.com/vlvkobal)) - Added ZRAM device metrics to the `proc.plugin` [#6276](https://github.com/netdata/netdata/pull/6276) [#6424](https://github.com/netdata/netdata/pull/6424) ([RaZeR-RBI](https://github.com/RaZeR-RBI)) ### Archiving - Added a new MongoDB backend [#6524](https://github.com/netdata/netdata/pull/6524) ([vlvkobal](https://github.com/vlvkobal)) ### Documentation - Add a statement about permissions for the diskspace plugin [#6474](https://github.com/netdata/netdata/pull/6474) ([vlvkobal](https://github.com/vlvkobal)) - Improved the running behind Nginx guide [#6466](https://github.com/netdata/netdata/pull/6466) ([prhomhyse](https://github.com/prhomhyse)) - Add more 
supported backends to the documentation [#6443](https://github.com/netdata/netdata/pull/6443) ([vlvkobal](https://github.com/vlvkobal)) - Removed Ventureer from the list of demo sites [#6442](https://github.com/netdata/netdata/pull/6442) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Updated docs health monitoring and health management api documentation [#6435](https://github.com/netdata/netdata/pull/6435) ([jghaanstra](https://github.com/jghaanstra)) - Fixed issues in HTML docs generation, causing the hyperlink checks to function improperly [#6433](https://github.com/netdata/netdata/pull/6433) ([cakrit](https://github.com/cakrit)) - New 'homepage' for documentation site [#6428](https://github.com/netdata/netdata/pull/6428) ([joelhans](https://github.com/joelhans)) - Styling improvements to documentation [#6425](https://github.com/netdata/netdata/pull/6425) ([joelhans](https://github.com/joelhans)) - Add documentation for binary packages, plus draft table for distributions support [#6422](https://github.com/netdata/netdata/pull/6422) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Update netdata installation dependencies [#6421](https://github.com/netdata/netdata/pull/6421) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Added better explanation of nightly and stable releases [#6388](https://github.com/netdata/netdata/pull/6388) ([joelhans](https://github.com/joelhans)) - Add netdata haproxy documentation page [#6454](https://github.com/netdata/netdata/pull/6454) ([johnramsden](https://github.com/johnramsden)) - Added Netdata Cloud documentation [#6476](https://github.com/netdata/netdata/pull/6476) ([joelhans](https://github.com/joelhans)) - Removed text about nightly version [#6534](https://github.com/netdata/netdata/pull/6534) ([joelhans](https://github.com/joelhans)) - Provided documentation style guide & build instructions [#6563](https://github.com/netdata/netdata/pull/6563) ([joelhans](https://github.com/joelhans)) - Install 
Netdata with Docker [#6596](https://github.com/netdata/netdata/pull/6596) ([prhomhyse](https://github.com/prhomhyse)) - Fixed typos in: 'README.md' file. [#6604](https://github.com/netdata/netdata/pull/6604) ([coffeina](https://github.com/coffeina)) - Change "netdata" to "Netdata" in all docs [#6621](https://github.com/netdata/netdata/pull/6621) ([joelhans](https://github.com/joelhans)) - Fixed Markdown Lint warnings [#6664](https://github.com/netdata/netdata/pull/6664) ([prhomhyse](https://github.com/prhomhyse)) - Improved Apache reverse proxy documentation on Content Security Policy [#6667](https://github.com/netdata/netdata/pull/6667) ([sunflowerbofh](https://github.com/sunflowerbofh)) ### Other - Updated our CLA, clarifying our intention to keep netdata FOSS [#6504](https://github.com/netdata/netdata/pull/6504) ([cakrit](https://github.com/cakrit)) - Updated terms of use for U.S. legal reasons [#6631](https://github.com/netdata/netdata/pull/6631) ([cakrit](https://github.com/cakrit)) - Updated logos in the infographic and remaining favicons [#6417](https://github.com/netdata/netdata/pull/6417) ([cakrit](https://github.com/cakrit)) - SSL vs. 
TLS consistency and clarification in documentation [#6414](https://github.com/netdata/netdata/pull/6414) ([joelhans](https://github.com/joelhans)) - Update Running-behind-apache.md [#6406](https://github.com/netdata/netdata/pull/6406) ([Steve8291](https://github.com/Steve8291)) - Fix Web API Health documentation [#6404](https://github.com/netdata/netdata/pull/6404) ([thiagoftsm](https://github.com/thiagoftsm)) - Added apps grouping debug messages [#6375](https://github.com/netdata/netdata/pull/6375) ([vlvkobal](https://github.com/vlvkobal)) - GCC warning and linting improvements [#6392](https://github.com/netdata/netdata/pull/6392) ([ac000](https://github.com/ac000)) - Minor code readability changes [#6539](https://github.com/netdata/netdata/pull/6539) ([underhood](https://github.com/underhood)) - Added global configuration option to show charts with zero metrics [#6419](https://github.com/netdata/netdata/pull/6419) ([vlvkobal](https://github.com/vlvkobal)) - Improved the way we parse HTTP requests, so we can avoid issues from edge cases [#6247](https://github.com/netdata/netdata/pull/6247) [#6714](https://github.com/netdata/netdata/pull/6714) ([thiagoftsm](https://github.com/thiagoftsm)) - Build DEB and RPM packages in parallel [#6579](https://github.com/netdata/netdata/pull/6579) ([knatsakis](https://github.com/knatsakis)) - Updated package version requirements for LZ4 and libuv [#6607](https://github.com/netdata/netdata/pull/6607) ([mfundul](https://github.com/mfundul)) - Improved system OS detection for RHEL6 and Mac OS X [#6612](https://github.com/netdata/netdata/pull/6612) ([dex4er](https://github.com/dex4er)) - .travis.yml: Remove 'sudo: true' as it is now deprecated [#6624](https://github.com/netdata/netdata/pull/6624) ([knatsakis](https://github.com/knatsakis)) - Modified the documentation build process to accept \<> around links in markdown [#6646](https://github.com/netdata/netdata/pull/6646) ([cakrit](https://github.com/cakrit)) - Fixed spigotmc module 
typos in comments. [#6680](https://github.com/netdata/netdata/pull/6680) ([Cat7373](https://github.com/Cat7373)) ## Bug fixes - Fixed the snappy library detection in some versions of OpenSuSE and CentOS [#6479](https://github.com/netdata/netdata/pull/6479) ([vlvkobal](https://github.com/vlvkobal)) - Fixed sensor chips filtering in python sensors collector [#6463](https://github.com/netdata/netdata/pull/6463) ([ilyam8](https://github.com/ilyam8)) - Fixed user and group names in apps.plugin when running in a container, by mounting and reading `/etc/passwd` [#6472](https://github.com/netdata/netdata/pull/6472) ([vlvkobal](https://github.com/vlvkobal)) - Fixed possible buffer overflow in the JSON parser used for health notification silencers [#6460](https://github.com/netdata/netdata/pull/6460) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed handling of corrupted DB files in dbengine, that could cause netdata to not start properly (CRC and I/O error handling) [#6452](https://github.com/netdata/netdata/pull/6452) ([mfundul](https://github.com/mfundul)) - Stopped docs icon from linking to streaming page instead of docs root [#6445](https://github.com/netdata/netdata/pull/6445) ([joelhans](https://github.com/joelhans)) - Fixed an issue with Netdata snapshots that could sometimes cause a problem during import. [#6400](https://github.com/netdata/netdata/pull/6400) ([jacekkolasa](https://github.com/jacekkolasa)) - Fixed bug that would cause netdata to attempt to kill already terminated threads again, on shutdown. 
[#6387](https://github.com/netdata/netdata/pull/6387) ([emmrk](https://github.com/emmrk)) - Fixed out of memory (12) errors by reimplementing the myopen() function family [#6339](https://github.com/netdata/netdata/pull/6339) ([mfundul](https://github.com/mfundul)) - Fixed wrong redirection of users signing in after clicking Nodes [#6544](https://github.com/netdata/netdata/pull/6544) ([jacekkolasa](https://github.com/jacekkolasa)) - Fixed python.d smartd collector increasing CPU usage [#6540](https://github.com/netdata/netdata/pull/6540) ([ilyam8](https://github.com/ilyam8)) - Fixed missing navigation arrow in Documentation [#6533](https://github.com/netdata/netdata/pull/6533) ([joelhans](https://github.com/joelhans)) - Fixed mongodb python collector stock configuration mistake, by changing `password` to `pass` [#6518](https://github.com/netdata/netdata/pull/6518) ([ilyam8](https://github.com/ilyam8)) - Fixed broken left navbar links in translated docs [#6505](https://github.com/netdata/netdata/pull/6505) ([cakrit](https://github.com/cakrit)) - Fixed handling of UTF8 characters in badges and added International Support to the URL parser [#6426](https://github.com/netdata/netdata/pull/6426) ([underhood](https://github.com/underhood)) - Fixed nodes menu sizing (responsive) [#6455](https://github.com/netdata/netdata/pull/6455) ([builat](https://github.com/builat)) - Fixed issues with http redirection to https and streaming encryption [#6468](https://github.com/netdata/netdata/pull/6468) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed broken links to `arcstat.py` and `arc_summary.py` in dashboard_info.js [#6461](https://github.com/netdata/netdata/pull/6461) ([TheLovinator1](https://github.com/TheLovinator1)) - Fixed bug with the nfacct plugin that resulted in missing dimensions from the charts [#6098](https://github.com/netdata/netdata/pull/6098) ([fun04wr0ng](https://github.com/fun04wr0ng)) - Stopped anonymous stats from trying to write a log under `/tmp` 
[#6491](https://github.com/netdata/netdata/pull/6491) ([cakrit](https://github.com/cakrit)) - Fixed a problem with `edit-config`, the configuration editor, not being able to run in MacOS. We no longer deliver edit-config as part of the distribution tarball, so that it can get generated with proper configuration during installation. [#6507](https://github.com/netdata/netdata/pull/6507) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed issue with the netdata-updater that caused it not to run properly in static64 installations. [#6520](https://github.com/netdata/netdata/pull/6520) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed some yamllint errors in our Travis configuration [#6526](https://github.com/netdata/netdata/pull/6526) ([knatsakis](https://github.com/knatsakis)) - Properly delete obsolete dimensions for inactive disks in smartd_log [#6547](https://github.com/netdata/netdata/pull/6547) ([ilyam8](https://github.com/ilyam8)) - Fixed `.environment` file getting overwritten, by moving tarball checksum information into lib dir of netdata [#6555](https://github.com/netdata/netdata/pull/6555) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed handling of disconnected sockets in unbound python.d collector. 
[#6561](https://github.com/netdata/netdata/pull/6561) ([Ferroin](https://github.com/Ferroin)) - Fixed crash in malloc [#6583](https://github.com/netdata/netdata/pull/6583) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed installer error `undefined reference to LZ4_compress_default` [#6589](https://github.com/netdata/netdata/pull/6589) ([mfundul](https://github.com/mfundul)) - Fixed issue with mysql collector that resulted in showing only a single slave_status chart, regardless of the number of replication channels [#6597](https://github.com/netdata/netdata/pull/6597) ([ilyam8](https://github.com/ilyam8)) - Fixed installer issue that would automatically enable the netdata service, even if it was previously disabled [#6606](https://github.com/netdata/netdata/pull/6606) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed a segmentation fault in backends [#6627](https://github.com/netdata/netdata/pull/6627) ([vlvkobal](https://github.com/vlvkobal)) - Fixed spigotmc plugin bugs [#6635](https://github.com/netdata/netdata/pull/6635) ([Cat7373](https://github.com/Cat7373)) - Fixed installer error when running `kickstart.sh` as a non-privileged user [#6642](https://github.com/netdata/netdata/pull/6642) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed issue causing OpenSSL libraries to not be found on gentoo [#6670](https://github.com/netdata/netdata/pull/6670) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed dbengine 100% CPU usage due to corrupted transaction payload handling [#6731](https://github.com/netdata/netdata/pull/6731) ([mfundul](https://github.com/mfundul)) - Fixed wrong default paths in certain installations [#6678](https://github.com/netdata/netdata/pull/6678) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed exact path to netdata.conf in .gitignore [#6709](https://github.com/netdata/netdata/pull/6709) ([sunflowerbofh](https://github.com/sunflowerbofh)) - Fixed static64 installer bug that resulted 
in always overwriting configuration [#6710](https://github.com/netdata/netdata/pull/6710) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) Thanks to the community for their help! 2019-09-03T09:57:55+00:00 netdata v1.17.1 netdata v1.17.1 2019-09-12T16:46:03+00:00 # Netdata v1.17.1 Release v1.17.1 contains 2 bug fixes, 6 improvements, and 2 documentation updates. ## At a glance The main reason for the patch release is an essential fix to the repeating alarm notifications we introduced in v1.17.0. If you enabled repeating notifications, Netdata would not then send CLEAR notifications for the selected alarms. The release also includes a significant improvement to Netdata's auto-detection capabilities, especially after a system restart. Netdata now remembers which `python.d` plugin jobs were successfully collecting data the last time it was running, and retries to run those jobs for 5 minutes before giving up. As a result, you no longer have to worry if your system starts Netdata before the monitored services have had a chance to start properly. We will complete the same improvement for `go.d` plugins in v1.18.0. We also made some improvements to our binary packages and added a [neat sample custom dashboard](https://docs.netdata.cloud/web/gui/custom/#dash-multi-host-dashboard) that can show charts from multiple Netdata agents. ## Acknowledgements Our thanks go to: - [tnyeanderson](https://github.com/tnyeanderson) for `Dash.html`, the custom dashboard that can show charts from multiple hosts. - [qingkunl](https://github.com/qingkunl) for improving the charts auto-scaling feature with nanosec and num units. 
- [Fohdeesha](https://github.com/Fohdeesha) for documentation improvements - [Saruspete](https://github.com/Saruspete) for improving debugging capabilities with tags for threads and his significant involvement in many other issues ## Improvements ### Binary packages - netdata/packaging: Trigger stable package generation upon release process [\#6766](https://github.com/netdata/netdata/pull/6766) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - netdata/packaging: Fix ubuntu/xenial runtime dependencies [\#6825](https://github.com/netdata/netdata/pull/6825) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - netdata/packaging: Remove fedora/28, which is no longer available [\#6808](https://github.com/netdata/netdata/pull/6808) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - netdata/packaging: Override control file for debian/buster [\#6777](https://github.com/netdata/netdata/pull/6777) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) ### GUI - Expand dashboard auto-scaling and convertible units. Added two more units that allow auto-scaling and conversion: nanoseconds and num. [\#5920](https://github.com/netdata/netdata/pull/5920) ([qingkunl](https://github.com/qingkunl)) ### Collector improvements - Auto-detect previously running python.d jobs and retry for 5 minutes [\#6661](https://github.com/netdata/netdata/pull/6661) ([ilyam8](https://github.com/ilyam8)) ### Documentation - Fix pfsense instructions and links [\#6768](https://github.com/netdata/netdata/pull/6768) ([Fohdeesha](https://github.com/Fohdeesha)) - Add high level explanation of dashboard contents [\#6648](https://github.com/netdata/netdata/pull/6648) ([joelhans](https://github.com/joelhans)) ### Other - Update cache hashes for js and css [\#6756](https://github.com/netdata/netdata/pull/6756) ([jacekkolasa](https://github.com/jacekkolasa)) - Provide a tag to identify the thread in the error messages. 
[\#6745](https://github.com/netdata/netdata/pull/6745) ([Saruspete](https://github.com/Saruspete)) - Add sample multi-server dashboard `dash.html` [\#6603](https://github.com/netdata/netdata/pull/6603) ([tnyeanderson](https://github.com/tnyeanderson)) - Replace hard-coded HTTP response codes [\#6595](https://github.com/netdata/netdata/pull/6595) ([thiagoftsm](https://github.com/thiagoftsm)) ## Bug fixes - Fix clear notifications for repeating alarms [\#6638](https://github.com/netdata/netdata/pull/6638) ([thiagoftsm](https://github.com/thiagoftsm)) - Stop `configure.ac` from linking against dbengine and https libraries when dbengine or https are disabled [\#6658](https://github.com/netdata/netdata/pull/6658) ([mfundul](https://github.com/mfundul)) 2019-09-12T16:46:03+00:00 netdata v1.18.0 netdata v1.18.0 2019-10-10T14:16:24+00:00 # Netdata v1.18.0 Release v1.18.0 contains 5 new collectors, 19 bug fixes, 28 improvements, and 20 documentation updates. ## At a glance The **database engine** is now the default method of storing metrics in Netdata. You immediately get more efficient and configurable long-term metrics storage without any work on your part. By saving recent metrics in RAM and "spilling" historical metrics to disk for long-term storage, the database engine is laying the foundation for many more improvements to distributed metrics. We even have a [tutorial](https://docs.netdata.cloud/docs/tutorials/longer-metrics-storage/) on switching to the database engine and getting the most from it. Or, just read up on [how performant](https://docs.netdata.cloud/database/engine/#evaluation) the database engine really is. Both our `python.d` and `go.d` plugins now have more **intelligent auto-detection** by periodically dumping a list of active modules to disk. When Netdata starts, such as after a reboot, the plugins use this list of known services to re-establish metrics collection much more reliably. 
No more worrying if the service or application you need to monitor starts up minutes after Netdata. Two of our new collectors will help those with Hadoop big data infrastructures. The **HDFS and Zookeeper collection modules** come with essential alarms requested by our community and Netdata's auto-detection capabilities to keep the required configuration to an absolute minimum. Read up on the process via our [HDFS and Zookeeper tutorial](https://docs.netdata.cloud/docs/tutorials/monitor-hadoop-cluster/). Speaking of new collectors, we also added the ability to collect metrics from SLAB cache, Gearman, and vCenter Server Appliances. Before v1.18, if you wanted to create alarms for each dimension in a single chart, you needed to write separate entities for each dimension, which was neither efficient nor user-friendly. New **dimension templates** fix that hassle. Now, a single entity can automatically generate alarms for any number of dimensions in a chart, even those you weren't aware of! Our [tutorial on dimension templates](https://docs.netdata.cloud/docs/tutorials/dimension-templates/) has all the details. v1.18 brings support for installing Netdata on offline or air-gapped systems. To help users comply with strict security policies, our installation scripts can now install Netdata using a previously downloaded tarball and checksums instead of downloading them at runtime. We have guides for installing offline via `kickstart.sh` or `kickstart-static64.sh` in our [installation documentation](https://docs.netdata.cloud/packaging/installer/#offline-installations). We're excited to bring real-time monitoring to once-inaccessible systems! ## Acknowledgements Our thanks go to: - [Saruspete](https://github.com/Saruspete) for several contributions, including the new `slabinfo` collector, which monitors [SLAB cache mechanism](https://docs.netdata.cloud/collectors/slabinfo.plugin/) metrics. 
- [agronick](https://github.com/agronick) for the new [Gearman worker statistics](https://docs.netdata.cloud/collectors/python.d.plugin/gearman/) collector - [OneCodeMonkey](https://github.com/OneCodeMonkey) for a bug fix in the alarm notification script. - [lets00](https://github.com/lets00) for providing a Portuguese \(Brazil\) translation of the installation instructions - [mbarper](https://github.com/mbarper) and [davent](https://github.com/davent) for improvements to the uninstaller. - [n0coast](https://github.com/n0coast) for a documentation fix. ## Improvements ### Database engine - Make dbengine the default memory mode [\#6977](https://github.com/netdata/netdata/pull/6977) ([mfundul](https://github.com/mfundul)) - Increase dbengine default cache size [\#6997](https://github.com/netdata/netdata/pull/6997) ([mfundul](https://github.com/mfundul)) - Reduce overhead during write IO [\#6964](https://github.com/netdata/netdata/pull/6964) ([mfundul](https://github.com/mfundul)) - Detect deadlock in dbengine page cache [\#6911](https://github.com/netdata/netdata/pull/6911) ([mfundul](https://github.com/mfundul)) - Remove hard cap from page cache size to eliminate deadlocks. 
[\#7006](https://github.com/netdata/netdata/pull/7006) ([mfundul](https://github.com/mfundul)) ### New Collectors - [SLAB cache mechanism](https://docs.netdata.cloud/collectors/slabinfo.plugin/) ([Saruspete](https://github.com/Saruspete)) - [Gearman worker statistics](https://docs.netdata.cloud/collectors/python.d.plugin/gearman/) - [vCenter Server Appliance](https://docs.netdata.cloud/collectors/go.d.plugin/modules/vcsa/) - [Zookeeper servers](https://docs.netdata.cloud/collectors/go.d.plugin/modules/zookeeper/) - [Hadoop Distributed File System (HDFS) nodes](https://docs.netdata.cloud/collectors/go.d.plugin/modules/hdfs/) ### Collector improvements - rabbitmq: Add vhosts message metrics from `/api/vhosts` [\#6976](https://github.com/netdata/netdata/pull/6976) ([ilyam8](https://github.com/ilyam8)) - elasticsearch: collect metrics from \_cat/indices [\#6965](https://github.com/netdata/netdata/pull/6965) ([ilyam8](https://github.com/ilyam8)) - mysql: collect galera cluster metrics [\#6962](https://github.com/netdata/netdata/pull/6962) ([ilyam8](https://github.com/ilyam8)) - Allow configuration of the python.d launch command from netdata.conf [\#6781](https://github.com/netdata/netdata/pull/6781) ([amoss](https://github.com/amoss)) - [x509check](https://github.com/netdata/go.d.plugin/tree/master/modules/x509check): smtp cert check support (https://github.com/netdata/go.d.plugin/pull/261) - [dnsmasq_dhcp](https://github.com/netdata/go.d.plugin/tree/master/modules/dnsmasq_dhcp): respect conf-dir,conf-file,dhcp-host options (https://github.com/netdata/go.d.plugin/pull/268) - plugin: respect previously running jobs after plugin restart (https://github.com/netdata/netdata/issues/6499) - [httpcheck](https://github.com/netdata/go.d.plugin/tree/master/modules/httpcheck): add current state duration chart (https://github.com/netdata/go.d.plugin/pull/270) - [springboot2](https://github.com/netdata/go.d.plugin/tree/master/modules/springboot2): fix context 
(https://github.com/netdata/go.d.plugin/pull/263) ### Health - Enable alarm templates for chart dimensions [\#6560](https://github.com/netdata/netdata/pull/6560) ([thiagoftsm](https://github.com/thiagoftsm)) - Center the chart on the proper chart and time whenever an alarm link is clicked [\#6391](https://github.com/netdata/netdata/pull/6391) ([thiagoftsm](https://github.com/thiagoftsm)) ### Installation/Packages - netdata/installer: Add support for offline installations using `kickstart.sh` or `kickstart-static64.sh` [\#6693](https://github.com/netdata/netdata/pull/6693) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Allow netdata service installation, when docker runs systemd [\#6987](https://github.com/netdata/netdata/pull/6987) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Make spec file more consistent with version dependencies [\#6948](https://github.com/netdata/netdata/pull/6948) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fix broken links on web files, for DEB [\#6930](https://github.com/netdata/netdata/pull/6930) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Introduce separate CUPS package for DEB [\#6724](https://github.com/netdata/netdata/pull/6724) and RPM [\#6700](https://github.com/netdata/netdata/pull/6700) distributions. ([paulkatsoulakis](https://github.com/paulkatsoulakis)). 
Do not build CUPS plugin subpackage on CentOS 6 and CentOS 7 [\#6926](https://github.com/netdata/netdata/pull/6926) ([knatsakis](https://github.com/knatsakis)) - Various Improvements in the package release CI/CD flow [\#6914](https://github.com/netdata/netdata/pull/6914) [\#6905](https://github.com/netdata/netdata/pull/6905) [\#6842](https://github.com/netdata/netdata/pull/6842) [\#6837](https://github.com/netdata/netdata/pull/6837) [\#6838](https://github.com/netdata/netdata/pull/6838) [\#6834](https://github.com/netdata/netdata/pull/6834) ([paulkatsoulakis](https://github.com/paulkatsoulakis)), [\#6900](https://github.com/netdata/netdata/pull/6900) ([cakrit](https://github.com/cakrit)) - Remove RHEL7 - i386 binary distribution, until bug \#6849 is resolved [\#6902](https://github.com/netdata/netdata/pull/6902) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Bring on board two scripts that build `libuv` and `judy` from source [\#6850](https://github.com/netdata/netdata/pull/6850) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) ### Documentation - Add Portuguese \(Brazil\) translation of the installation instructions [\#16](https://github.com/netdata/localization/pull/16)([lets00](https://github.com/lets00)), [\#7004](https://github.com/netdata/netdata/pull/7004) ([cakrit](https://github.com/cakrit)) - Fix broken links found via linkchecker [\#6983](https://github.com/netdata/netdata/pull/6983) ([joelhans](https://github.com/joelhans)) - Clarification on configuring notification recipients [\#6961](https://github.com/netdata/netdata/pull/6961) ([cakrit](https://github.com/cakrit)) - Fix Remark Lint for READMEs in database [\#6942](https://github.com/netdata/netdata/pull/6942), contrib [\#6921](https://github.com/netdata/netdata/pull/6921), daemon README [\#6920](https://github.com/netdata/netdata/pull/6920) and backends [\#6917](https://github.com/netdata/netdata/pull/6917) ([prhomhyse](https://github.com/prhomhyse)) - Suggest using /run or 
/var/run for the unix socket [\#6916](https://github.com/netdata/netdata/pull/6916) ([cakrit](https://github.com/cakrit)) - Improve documentation for the SNMP collector [\#6915](https://github.com/netdata/netdata/pull/6915) ([cakrit](https://github.com/cakrit)) - Update docs for offline install [\#6884](https://github.com/netdata/netdata/pull/6884) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Remove Dollar sign from Bash code in documentation and fix remark-lint warnings [\#6880](https://github.com/netdata/netdata/pull/6880) ([prhomhyse](https://github.com/prhomhyse)) - Markdown syntax fixes for MDX parser [\#6877](https://github.com/netdata/netdata/pull/6877) ([joelhans](https://github.com/joelhans)) - Update python.d module checklist to match the current paths and build system. [\#6874](https://github.com/netdata/netdata/pull/6874) ([Ferroin](https://github.com/Ferroin)) - Add instructions for simple SMTP transport [\#6870](https://github.com/netdata/netdata/pull/6870) ([cakrit](https://github.com/cakrit)) - Add example for prometheus archiving source parameter [\#6869](https://github.com/netdata/netdata/pull/6869) ([cakrit](https://github.com/cakrit)) - Fix broken links in the standard web dashboard doc [\#6854](https://github.com/netdata/netdata/pull/6854) ([prhomhyse](https://github.com/prhomhyse)) - Overhaul of Getting started guide [\#6811](https://github.com/netdata/netdata/pull/6811) ([joelhans](https://github.com/joelhans)) - NPM Packages version update [\#6801](https://github.com/netdata/netdata/pull/6801) ([prhomhyse](https://github.com/prhomhyse)) - Update suggested `grep` command in “high performance netdata” to be more specific [\#6794](https://github.com/netdata/netdata/pull/6794) ([n0coast](https://github.com/n0coast)) ### Other - API: Include `family` into the `allmetrics` JSON response [\#6966](https://github.com/netdata/netdata/pull/6966) ([ilyam8](https://github.com/ilyam8)) - API: Add fixed width option to badges 
[\#6903](https://github.com/netdata/netdata/pull/6903) ([underhood](https://github.com/underhood)) - Allow hostnames in Access Control Lists [\#6796](https://github.com/netdata/netdata/pull/6796) ([amoss](https://github.com/amoss)) - Functional test improvements for web and alarms tests [\#6783](https://github.com/netdata/netdata/pull/6783) ([thiagoftsm](https://github.com/thiagoftsm)) ## Bug fixes - Fix error in the alarm notification script when executed without any arguments [\#7003](https://github.com/netdata/netdata/pull/7003) ([OneCodeMonkey](https://github.com/OneCodeMonkey)) - Fix Coverity warnings [\#6992](https://github.com/netdata/netdata/pull/6992) [\#6970](https://github.com/netdata/netdata/pull/6970) [\#6941](https://github.com/netdata/netdata/pull/6941) [\#6797](https://github.com/netdata/netdata/pull/6797) ([thiagoftsm](https://github.com/thiagoftsm)), [\#6909](https://github.com/netdata/netdata/pull/6909) ([cakrit](https://github.com/cakrit)) - Fix dbengine consistency when a writer modifies a page concurrently with a reader querying its metrics [\#6979](https://github.com/netdata/netdata/pull/6979) ([mfundul](https://github.com/mfundul)) - Fix memory leak on netdata exit [\#6945](https://github.com/netdata/netdata/pull/6945) ([vlvkobal](https://github.com/vlvkobal)) - Fix for missing boundary data points in certain cases [\#6938](https://github.com/netdata/netdata/pull/6938) ([mfundul](https://github.com/mfundul)) - Fix `unhandled exception` log warnings in the `python.d` collector orchestrator `start_job` [\#6928](https://github.com/netdata/netdata/pull/6928) ([ilyam8](https://github.com/ilyam8)) - Fix CORS errors when accessing the health management API, by permitting `x-auth-token` in `Access-Control-Allow-Headers` [\#6894](https://github.com/netdata/netdata/pull/6894) ([cakrit](https://github.com/cakrit)) - Fix misleading error log entries `RRDSET: chart name 'XXX' on host 'YYY' already exists`, by changing the log level for chart updates 
[\#6887](https://github.com/netdata/netdata/pull/6887) ([vlvkobal](https://github.com/vlvkobal)) - Properly resolve all Kubernetes container names [\#6885](https://github.com/netdata/netdata/pull/6885) ([cakrit](https://github.com/cakrit)) - Fix LGTM warnings [\#6875](https://github.com/netdata/netdata/pull/6875) ([jacekkolasa](https://github.com/jacekkolasa)) - Fix agent UI redirect loop during cloud sign-in [\#6868](https://github.com/netdata/netdata/pull/6868) ([jacekkolasa](https://github.com/jacekkolasa)) - Fix `/var/lib/netdata/registry` getting left behind after uninstall [\#6867](https://github.com/netdata/netdata/pull/6867) ([davent](https://github.com/davent)) - Fix python.d.plugin bug in parsing configuration files with no explicitly defined jobs [\#6856](https://github.com/netdata/netdata/pull/6856) ([ilyam8](https://github.com/ilyam8)) - Fix potential buffer overflow in the web server [\#6817](https://github.com/netdata/netdata/pull/6817) ([amoss](https://github.com/amoss)) - Fix netdata group deletion on linux for uninstall script [\#6645](https://github.com/netdata/netdata/pull/6645) ([mbarper](https://github.com/mbarper)) - Various `cppcheck` fixes [\#6386](https://github.com/netdata/netdata/pull/6386) ([ac000](https://github.com/ac000)) - Fix crash on FreeBSD due to do\_dev\_cpu\_temperature stack corruption [\#7014](https://github.com/netdata/netdata/pull/7014) ([samm-git](https://github.com/samm-git)) - Fix handling of illegal metric timestamps in database engine [\#7008](https://github.com/netdata/netdata/pull/7008) ([mfundul](https://github.com/mfundul)) - Fix a resource leak [\#7007](https://github.com/netdata/netdata/pull/7007) ([vlvkobal](https://github.com/vlvkobal)) - Fix rabbitmq collector error when no vhosts are available. 
[\#7018](https://github.com/netdata/netdata/pull/7018) ([mfundul](https://github.com/ilyam8)) 2019-10-10T14:16:24+00:00 netdata v1.18.1 netdata v1.18.1 2019-10-18T16:17:10+00:00 # Netdata v1.18.1 Release v1.18.1 contains 17 bug fixes, 5 improvements, and 5 documentation updates. ## At a glance Patch release 1.18.1 contains several bug fixes, mainly related to FreeBSD and the binary package generation process. Netdata can now [send notifications to Google Hangouts Chat](https://docs.netdata.cloud/health/notifications/hangouts/)! On certain systems, the `slabinfo` plugin introduced in v1.18.0 added thousands of new metrics. We decided the collector's usefulness to most users didn't justify the increase in resource requirements. This release disables the collector by default. Finally, we added a chart under **Netdata Monitoring** to present a better view of the RAM used by the [database engine (dbengine)](https://docs.netdata.cloud/database/engine/). The chart doesn't currently take into consideration the RAM used for slave nodes, so we intend to add more related charts in the future. 
## Acknowledgements We'd like to thank: - [hendrikhofstadt](https://github.com/hendrikhofstadt) for the Google Hangouts notifications - [stevenh](https://github.com/stevenh) for the awesome zombie process reaper and the fix for the freeipmi collector - [samm-git](https://github.com/samm-git) for the addition of the VMware VMXNET3 driver to the default interfaces list for FreeBSD - [sz4bi](https://github.com/sz4bi) for a documentation fix ## Improvements - Disable `slabinfo` plugin by default to reduce the total number of metrics collected [\#7056](https://github.com/netdata/netdata/pull/7056) ([vlvkobal](https://github.com/vlvkobal)) - Add dbengine RAM usage statistics [\#7038](https://github.com/netdata/netdata/pull/7038) ([mfundul](https://github.com/mfundul)) - Support Google Hangouts chat notifications [\#7013](https://github.com/netdata/netdata/pull/7013) ([hendrikhofstadt](https://github.com/hendrikhofstadt)) - Add CMocka unit tests [\#6985](https://github.com/netdata/netdata/pull/6985) ([vlvkobal](https://github.com/vlvkobal)) - Add prerequisites to enable automatic updates for installations via the static binary (`kickstart-static64.sh`) [\#7060](https://github.com/netdata/netdata/pull/7060) ([knatsakis](https://github.com/knatsakis)) ### Documentation - Fix typo in health\_alarm\_notify.conf [\#7062](https://github.com/netdata/netdata/pull/7062) ([sz4bi](https://github.com/sz4bi)) - Fix BSD/pfSense documentation [\#7041](https://github.com/netdata/netdata/pull/7041) ([thiagoftsm](https://github.com/thiagoftsm)) - Document the structure of the `api/v1/data` API responses. 
[\#7012](https://github.com/netdata/netdata/pull/7012) ([amoss](https://github.com/amoss)) - Tutorials to support v1.18 features [\#6993](https://github.com/netdata/netdata/pull/6993) ([joelhans](https://github.com/joelhans)) - Fix broken links in docs [\#7123](https://github.com/netdata/netdata/pull/7123) ([joelhans](https://github.com/joelhans)) ## Bug fixes - Fix unbound collector timings: Convert recursion timings to milliseconds. [\#7121](https://github.com/netdata/netdata/pull/7121) ([Ferroin](https://github.com/Ferroin)) - Fix unbound collector unhandled exceptions [\#7112](https://github.com/netdata/netdata/pull/7112) ([ilyam8](https://github.com/ilyam8)) - Fix upgrade path from v1.17.1 to v1.18.x for deb packages [\#7118](https://github.com/netdata/netdata/pull/7118) ([knatsakis](https://github.com/knatsakis)) - Fix CPU charts in apps plugin on FreeBSD [\#7115](https://github.com/netdata/netdata/pull/7115) ([vlvkobal](https://github.com/vlvkobal)) - Fix megacli collector binary search and sudo check [\#7108](https://github.com/netdata/netdata/pull/7108) ([ilyam8](https://github.com/ilyam8)) - Fix missing packages, by running the triggers for DEB and RPM package build in separate stages [\#7105](https://github.com/netdata/netdata/pull/7105) ([knatsakis](https://github.com/knatsakis)) - Fix segmentation fault in FreeBSD when statsd is disabled [\#7102](https://github.com/netdata/netdata/pull/7102) ([vlvkobal](https://github.com/vlvkobal)) - Fix Clang warnings [\#7090](https://github.com/netdata/netdata/pull/7090) ([thiagoftsm](https://github.com/thiagoftsm)) - Fix python.d error logging: change chart suppress msg level from ERROR to INFO [\#7085](https://github.com/netdata/netdata/pull/7085) ([ilyam8](https://github.com/ilyam8)) - Fix freeipmi update frequency check: was warning that 5 was too frequent and it was setting it to 5. 
[\#7078](https://github.com/netdata/netdata/pull/7078) ([stevenh](https://github.com/stevenh)) - Fix alarm configurations not getting loaded, via better handling of chart names with special characters [\#7069](https://github.com/netdata/netdata/pull/7069) ([thiagoftsm](https://github.com/thiagoftsm)) - Fix dbengine not working when `mmap` fails - mostly with BSD kernels [\#7065](https://github.com/netdata/netdata/pull/7065) ([mfundul](https://github.com/mfundul)) - Fix FreeBSD issue due to incorrect size of a zeroed block [\#7061](https://github.com/netdata/netdata/pull/7061) ([vlvkobal](https://github.com/vlvkobal)) - Don't write HTTP response 204 messages to the logs [\#7035](https://github.com/netdata/netdata/pull/7035) ([vlvkobal](https://github.com/vlvkobal)) - Fix build when CMocka isn't installed [\#7129](https://github.com/netdata/netdata/pull/7129) ([vlvkobal](https://github.com/vlvkobal)) - FreeBSD plugin: Add VMware VMXNET3 driver to the default interfaces list [\#7109](https://github.com/netdata/netdata/pull/7109) ([samm-git](https://github.com/samm-git)) - Prevent zombie processes when a child is re-parented to netdata when it's running in a container, by adding a child process reaper [\#7059](https://github.com/netdata/netdata/pull/7059) ([stevenh](https://github.com/stevenh)) 2019-10-18T16:17:10+00:00 netdata v1.19.0 netdata v1.19.0 2019-11-28T01:41:38+00:00 # Netdata v1.19.0 Release v1.19.0 contains 2 new collectors, 19 bug fixes, 17 improvements, and 19 documentation updates. ## At a glance We completed a major rewrite of our **web log collector** to dramatically improve its flexibility and performance. The [new collector](https://github.com/netdata/go.d.plugin/pull/141), written entirely in Go, can parse and chart logs from Nginx and Apache servers, and combines numerous improvements. Netdata now supports the LTSV log format, creates charts for TLS and cipher usage, and is amazingly fast. 
In a test using SSD storage, the collector parsed the logs for 200,000 requests in about 200ms, using 30% of a single core. This Go-based collector also has powerful custom log parsing capabilities, which means we're one step closer to a generic application log parser for Netdata. We're continuing to work on this parser to support more application log formatting in the future. We have a new tutorial on [enabling the Go web log collector](https://docs.netdata.cloud/docs/tutorials/collect-apache-nginx-web-logs/) and using it with Nginx and/or Apache access logs with minimal configuration. Thanks to [Wing924](https://github.com/Wing924) for starting the Go rewrite! We introduced more **cmocka unit testing** to Netdata. In this release, we're testing how Netdata's internal web server processes HTTP requests—the first step to improve the quality of code throughout, reduce bugs, and make refactoring easier. We wanted to validate the web server's behavior but needed to build a layer of parametric testing on top of the CMocka test runner. Read all about our process of testing and selecting cmocka on our blog post: [Building an agile team's 'safety harness' with cmocka and FOSS](https://blog.netdata.cloud/agile-team-cmocka-foss/). Netdata's **Unbound collector** was also [completely rewritten in Go](https://github.com/netdata/go.d.plugin/pull/287) to improve how it collects and displays metrics. This new version can get dozens of metrics, including details on queries, cache, uptime, and even show per-thread metrics. See our [tutorial](https://docs.netdata.cloud/docs/tutorials/collect-unbound-metrics/) on enabling the new collector via Netdata's amazing auto-detection feature. We [fixed an error](https://github.com/netdata/netdata/pull/7220) where **invalid spikes** appeared on certain charts by improving the incremental counter reset/wraparound detection algorithm. 
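To illustrate the kind of check that counter reset/wraparound detection involves, here is a minimal Python sketch. It is not Netdata's actual algorithm: the function name `counter_delta`, the 32-bit default width, and the `max_plausible` threshold are all assumptions made for this example.

```python
from typing import Optional


def counter_delta(previous: int, current: int, bits: int = 32,
                  max_plausible: Optional[int] = None) -> int:
    """Return the increase between two samples of an incremental counter.

    Hypothetical helper for illustration only, not part of Netdata.
    """
    if current >= previous:
        # Normal case: the counter only moved forward.
        return current - previous

    # The counter went backwards, so it either wrapped around at 2**bits
    # or was reset to zero (for example, after a service restart).
    wrapped = current + (1 << bits) - previous

    # Assumed heuristic: if the wrapped delta is implausibly large, treat
    # the event as a reset and report no increase instead of a huge spike.
    if max_plausible is not None and wrapped > max_plausible:
        return 0
    return wrapped
```

With these assumptions, a sample pair that crosses the 32-bit boundary yields a small positive delta, while a counter that drops from a mid-range value to near zero is treated as a reset and contributes nothing to the chart.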
Netdata can now send [**health alarm notifications to IRC channels**](https://docs.netdata.cloud/health/notifications/irc/) thanks to [Strykar](https://github.com/Strykar)! And, Netdata can now monitor [**AM2320 sensors**](https://docs.netdata.cloud/collectors/python.d.plugin/am2320/), thanks to hard work from [Tom Buck](https://github.com/tommybuck). ## Acknowledgements Our thanks go to: - [andyundso](https://github.com/andyundso) for fixing the packagecloud binary installation in Debian 8. - [Strykar](https://github.com/Strykar) for adding support for IRC health notifications. - [tommybuck](https://github.com/tommybuck) for the new [AM2320 sensors](https://docs.netdata.cloud/collectors/python.d.plugin/am2320/) collector. - [Saruspete](https://github.com/Saruspete) for the new ability to provide metrics on fragmentation of free memory pages. - [OdysLam](https://github.com/OdysLam) for improving the documentation for new collector plugins. - [k0ste](https://github.com/k0ste), [xginn8](https://github.com/xginn8) and [nodiscc](https://github.com/nodiscc) for improving the configuration of the apps plugin. - [amichelic](https://github.com/amichelic) for improving the web\_log collector. - [cherouvim](https://github.com/cherouvim), [arkamar](https://github.com/arkamar), [half-duplex](https://github.com/half-duplex) and [CtrlAltDel64](https://github.com/CtrlAltDel64) for improving the documentation. - [mniestroj](https://github.com/mniestroj) for the fix to the dbengine compilation with musl standard C. - [arkamar](https://github.com/arkamar) for an improvement to the xenstat collector. - [vakartel](https://github.com/vakartel) for improving the cgroup network interfaces detection in Proxmox 6. ## Improvements ### New Collectors - AM2320 sensor collector plugin [\#7024](https://github.com/netdata/netdata/pull/7024) ([tommybuck](https://github.com/tommybuck)) - Added parsing of /proc/pagetypeinfo to provide metrics on fragmentation of free memory pages. 
[\#6843](https://github.com/netdata/netdata/pull/6843) ([Saruspete](https://github.com/Saruspete)) - The unbound collector module was completely rewritten, in Go [go.d.plugin/\#287](https://github.com/netdata/go.d.plugin/pull/287) ([ilyam8](https://github.com/ilyam8)) ### Collector improvements - We rewrote our web log parser in Go, drastically improving its flexibility and performance. [go.d.plugin/\#141](https://github.com/netdata/go.d.plugin/pull/141) ([ilyam8](https://github.com/ilyam8)) - The [Kubernetes kubelet collector](https://docs.netdata.cloud/collectors/go.d.plugin/modules/k8s_kubelet/) now reads the service account token and uses it for authorization. We also added a new default job to collect metrics from `https://localhost:10250/metrics`. [go.d.plugin/\#285](https://github.com/netdata/go.d.plugin/pull/285) - Added a new default job to the [Kubernetes coredns](https://docs.netdata.cloud/collectors/go.d.plugin/modules/coredns/) collector to collect metrics from `http://kube-dns.kube-system.svc.cluster.local:9153/metrics`. [go.d.plugin/\#285](https://github.com/netdata/go.d.plugin/pull/285) - apps.plugin: Synced FRRouting daemons configuration with the frr 7.2 release. [\#7333](https://github.com/netdata/netdata/pull/7333) ([k0ste](https://github.com/k0ste)) - apps.plugin: Added process group for git-related processes. [\#7289](https://github.com/netdata/netdata/pull/7289) ([nodiscc](https://github.com/nodiscc)) - apps.plugin: Added balena to the container-engines application group. [\#7287](https://github.com/netdata/netdata/pull/7287) ([xginn8](https://github.com/xginn8)) - web\_log: Treat 401 Unauthorized requests as successful. [\#7256](https://github.com/netdata/netdata/pull/7256) ([amichelic](https://github.com/amichelic)) - xenstat.plugin: Prepare for xen 4.13 by checking for the presence of `xenstat_vbd_error`. 
[\#7103](https://github.com/netdata/netdata/pull/7103) ([arkamar](https://github.com/arkamar)) - mysql: Added galera `cluster_status` alarm. [\#6989](https://github.com/netdata/netdata/pull/6989) ([ilyam8](https://github.com/ilyam8)) ### Metrics Database - Netdata generates alarms if the disk cannot keep up with data collection. [\#7139](https://github.com/netdata/netdata/pull/7139) ([mfundul](https://github.com/mfundul)) ### Health - Fine tune various default alarm configurations. [\#7322](https://github.com/netdata/netdata/pull/7322) ([Ferroin](https://github.com/Ferroin)) - Update SYN cookie alarm to be less aggressive. [\#7250](https://github.com/netdata/netdata/pull/7250) ([Ferroin](https://github.com/Ferroin)) - Added support for IRC alarm notifications [\#7148](https://github.com/netdata/netdata/pull/7148) ([Strykar](https://github.com/Strykar)) ### Installation/Packages - Corrected the Makefile.am files indentation, to prevent unexpected errors. [\#7252](https://github.com/netdata/netdata/pull/7252) ([knatsakis](https://github.com/knatsakis)) - Rationalized ownership and permissions of `/etc/netdata`. [\#7244](https://github.com/netdata/netdata/pull/7244) ([knatsakis](https://github.com/knatsakis)) - Made various improvements to the installer script `netdata-installer.sh`. [\#7200](https://github.com/netdata/netdata/pull/7200) ([knatsakis](https://github.com/knatsakis)) - Include go.d.plugin version v0.11.0 [\#7365](https://github.com/netdata/netdata/pull/7365) ([ilyam8](https://github.com/ilyam8)) ### Documentation - Correct versions of FreeNAS that Netdata is available on. [\#7355](https://github.com/netdata/netdata/pull/7355) ([knatsakis](https://github.com/knatsakis)) - Update plugins.d/README.md. [\#7335](https://github.com/netdata/netdata/pull/7335) ([OdysLam](https://github.com/OdysLam)) - Note regarding stable vs nightly was accidentally being shown as a code fragment in the installation documentation. 
[\#7330](https://github.com/netdata/netdata/pull/7330) ([cakrit](https://github.com/cakrit)) - Properly link to translated documents from netdata-security.md. [\#7343](https://github.com/netdata/netdata/pull/7343) ([cakrit](https://github.com/cakrit)) - Update documentation of the netdata-updater, to properly cover `kickstart-static64.sh` and `kickstart.sh` installations. [\#7262](https://github.com/netdata/netdata/pull/7262) ([knatsakis](https://github.com/knatsakis)) - Converted the swagger documentation to OpenAPI3.0. [\#7257](https://github.com/netdata/netdata/pull/7257) ([amoss](https://github.com/amoss)) - Minor corrections to the netdata installer documentation. [\#7246](https://github.com/netdata/netdata/pull/7246) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fix typo in collectors README. [\#7242](https://github.com/netdata/netdata/pull/7242) ([cherouvim](https://github.com/cherouvim)) - Clarified database engine/RAM in getting started guide. [\#7225](https://github.com/netdata/netdata/pull/7225) ([joelhans](https://github.com/joelhans)) - Suggest using `/var/run/netdata` for the unix socket, in running behind nginx documentation. [\#7206](https://github.com/netdata/netdata/pull/7206) ([CtrlAltDel64](https://github.com/CtrlAltDel64)) - Added GA links to new documents. [\#7194](https://github.com/netdata/netdata/pull/7194) ([joelhans](https://github.com/joelhans)) - Added a page for metrics archiving to TimescaleDB. [\#7180](https://github.com/netdata/netdata/pull/7180) ([joelhans](https://github.com/joelhans)) - Fixed typo in the `contrib/debian` descriptions for `cupsd`. [\#7154](https://github.com/netdata/netdata/pull/7154) ([arkamar](https://github.com/arkamar)) - Added user information to MySQL Python module documentation. [\#7128](https://github.com/netdata/netdata/pull/7128) ([prhomhyse](https://github.com/prhomhyse)) - [Document the results](https://docs.netdata.cloud/build/) of the spike investigation into CMake. 
[\#7114](https://github.com/netdata/netdata/pull/7114) ([amoss](https://github.com/amoss)) - Fix to docker-compose+Caddy installation. [\#7088](https://github.com/netdata/netdata/pull/7088) ([joelhans](https://github.com/joelhans)) - Fixed broken links and added setup instructions for Telegram health notifications. [\#7033](https://github.com/netdata/netdata/pull/7033) ([half-duplex](https://github.com/half-duplex)) - Minor grammar change in /web/gui documentation [\#7363](https://github.com/netdata/netdata/pull/7363) ([eviemsrs](https://github.com/eviemsrs)) ### Other - Improve Travis build warnings \(issue #7189\). [\#7312](https://github.com/netdata/netdata/pull/7312) ([amoss](https://github.com/amoss)) - Added cmocka testing for HTTP requests. [\#7308](https://github.com/netdata/netdata/pull/7308), [\#7264](https://github.com/netdata/netdata/pull/7264), [\#7210](https://github.com/netdata/netdata/pull/7210) ([amoss](https://github.com/amoss) and [vlvkobal](https://github.com/vlvkobal)) - CI/CD: Prevented nightly jobs from timing out [\#7238](https://github.com/netdata/netdata/pull/7238), [\#7214](https://github.com/netdata/netdata/pull/7214) ([knatsakis](https://github.com/knatsakis)) ## Bug fixes - Fixed packagecloud binary installation in Debian 8. [\#7342](https://github.com/netdata/netdata/pull/7342) ([andyundso](https://github.com/andyundso)) - Fixed missing libraries in certain compilations, by adding missing trailing backslash to `Makefile.am`. [\#7326](https://github.com/netdata/netdata/pull/7326) ([oxplot](https://github.com/oxplot)) - Prevented freezes due to isolated CPUs. [\#7318](https://github.com/netdata/netdata/pull/7318) ([stelfrag](https://github.com/stelfrag)) - Fixed missing streaming when slave has SSL activated. [\#7306](https://github.com/netdata/netdata/pull/7306) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed error 421 in IRC notifications, by removing a line break from the message.
[\#7243](https://github.com/netdata/netdata/pull/7243) ([thiagoftsm](https://github.com/thiagoftsm)) - `proc/pagetypeinfo` collection could under particular circumstances cause high CPU load. As a workaround, we disabled `pagetypeinfo` by default. [\#7230](https://github.com/netdata/netdata/pull/7230) ([vlvkobal](https://github.com/vlvkobal)) - Fixed incorrect memory allocation in `proc` plugin’s `pagetypeinfo` collector. [\#7187](https://github.com/netdata/netdata/pull/7187) ([thiagoftsm](https://github.com/thiagoftsm)) - Eliminated cached responses from the postgres collector. [\#7228](https://github.com/netdata/netdata/pull/7228) ([ilyam8](https://github.com/ilyam8)) - rabbitmq: Fixed `"disk_free": "disk_free_monitoring_disabled"` error. [\#7226](https://github.com/netdata/netdata/pull/7226) ([ilyam8](https://github.com/ilyam8)) - Fixed build with musl standard C library by including `limits.h` before using `LONG_MAX`. [\#7224](https://github.com/netdata/netdata/pull/7224) ([mniestroj](https://github.com/mniestroj)) - Fixed Apache module not working with letsencrypt certificate by allowing the python `UrlService` to skip `tls_verify` for http scheme. [\#7223](https://github.com/netdata/netdata/pull/7223) ([ilyam8](https://github.com/ilyam8)) - Fixed invalid spikes appearing in certain charts, by improving the incremental counter reset/wraparound detection algorithm. [\#7220](https://github.com/netdata/netdata/pull/7220) ([mfundul](https://github.com/mfundul)) - Fixed DNS-lookup performance issue on FreeBSD. [\#7132](https://github.com/netdata/netdata/pull/7132) ([amoss](https://github.com/amoss)) - Fixed handling of the `stable` option, so that the installers and automatic updater respect it. 
[\#7083](https://github.com/netdata/netdata/pull/7083) ([knatsakis](https://github.com/knatsakis)), [\#7051](https://github.com/netdata/netdata/pull/7051) ([oxplot](https://github.com/oxplot)) - Fixed the static binary installer’s handling of the `--auto-update` option. [\#7076](https://github.com/netdata/netdata/pull/7076) ([knatsakis](https://github.com/knatsakis)) - Fixed cgroup network interfaces classification on Proxmox 6. [\#7037](https://github.com/netdata/netdata/pull/7037) ([vakartel](https://github.com/vakartel)) - Added missing dbengine flags to the installer. [\#7027](https://github.com/netdata/netdata/pull/7027) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Fixed issue with unknown variables in alarm configuration expressions always being evaluated to zero. [\#6984](https://github.com/netdata/netdata/pull/6984) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed issue of automatically picking up Pi-hole stats from a Pi-hole instance installed on another device by disabling the default job that collects metrics from `http://pi.hole`. [go.d.plugin/\#289](https://github.com/netdata/go.d.plugin/pull/289) ([ilyam8](https://github.com/ilyam8)) 2019-11-28T01:41:38+00:00 netdata v1.20.0 netdata v1.20.0 2020-02-21T03:43:30+00:00 # Netdata v1.20.0 Release v1.20.0 contains 3 new collectors, 54 bug fixes, 89 improvements, and 38 documentation updates. ## At a glance Our first major release of 2020 comes with an alpha version of our new **eBPF collector**. eBPF ([extended Berkeley Packet Filter](https://lwn.net/Articles/740157/)) is a virtual bytecode machine, built directly into the Linux kernel, that you can use for advanced monitoring and tracing. With this release, the eBPF collector monitors system calls inside your kernel to help you understand and visualize the behavior of your file descriptors, virtual file system (VFS) actions, and process/thread interactions.
You can already use it for debugging applications and better understanding how the Linux kernel handles I/O and process management. The eBPF collector is in a technical preview, and doesn't come enabled out of the box. If you'd like to learn more about _why_ eBPF metrics are such an important addition to Netdata, see our blog post: [_Linux eBPF monitoring with Netdata_](https://blog.netdata.cloud/posts/linux-ebpf-monitoring-netdata/). When you're ready to get started, enable the eBPF collector by following the steps in our [documentation](https://docs.netdata.cloud/collectors/ebpf_process.plugin/). This release also introduces **host labels**, a powerful new way of organizing your Netdata-monitored systems. Netdata automatically creates a handful of labels for essential information, but you can supplement the defaults by segmenting your systems based on their location, purpose, operating system, or even when they went live. You can use host labels to create alarms that apply only to systems with specific labels, or apply labels to metrics you archive to other databases with our exporting engine. Because labels are streamed from slave to master systems, you can now find critical information about your entire infrastructure directly from the master system. Our [host labels tutorial](https://docs.netdata.cloud/docs/tutorials/using-host-labels/) will walk you through creating your first host labels and putting them to use in Netdata's other features. Finally, we introduced a new **CockroachDB collector**. Because we use CockroachDB internally, we wanted a better way of keeping tabs on the health and performance of our databases. Given how popular CockroachDB is right now, we know we're not alone, and are excited to share this collector with our community. See our [tutorial on monitoring CockroachDB metrics](https://docs.netdata.cloud/docs/tutorials/monitor-cockroachdb/) for set-up details.
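To make the host labels idea above more concrete, here is a minimal Python sketch of selecting systems by label, the way an alarm might be scoped to hosts with specific labels. It is an illustration only, not Netdata's implementation: the host names, the label keys (`location`, `purpose`), and the helpers `matches_labels` and `select_hosts` are all hypothetical, and shell-style glob matching is an assumption made for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical inventory: host name -> host labels (key/value strings).
HOSTS = {
    "web-01": {"location": "us-east", "purpose": "webserver"},
    "web-02": {"location": "eu-west", "purpose": "webserver"},
    "db-01": {"location": "us-east", "purpose": "database"},
}


def matches_labels(host_labels, selector):
    """Return True if every selector key is present on the host and its
    value matches the selector's glob pattern (assumed matching rule)."""
    return all(
        key in host_labels and fnmatchcase(host_labels[key], pattern)
        for key, pattern in selector.items()
    )


def select_hosts(hosts, selector):
    """Return the sorted names of hosts whose labels satisfy the selector."""
    return sorted(
        name for name, labels in hosts.items() if matches_labels(labels, selector)
    )


if __name__ == "__main__":
    # Hosts that a "webserver-only" rule would apply to.
    print(select_hosts(HOSTS, {"purpose": "webserver"}))
    # Hosts in any US location, regardless of purpose.
    print(select_hosts(HOSTS, {"location": "us-*"}))
```

Requiring every selector key to match keeps a rule narrow by default, so adding a second label to a selector can only shrink the set of systems it applies to.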
We also added a new [**squid access log collector**](https://docs.netdata.cloud/collectors/go.d.plugin/modules/squidlog/#squid-logs-monitoring-with-netdata) that parses and visualizes requests, bandwidth, responses, and much more. Our [**apps.plugin collector**](https://docs.netdata.cloud/collectors/apps.plugin/) has a new and improved way of processing groups together, and our [**cgroups collector**](https://docs.netdata.cloud/collectors/cgroups.plugin/) is better at LXC (Linux container) monitoring. Speaking of collectors, we **revamped our [collectors documentation](https://docs.netdata.cloud/collectors/)** to simplify how users learn about metrics collection. You can now view a [collectors quickstart](https://docs.netdata.cloud/collectors/quickstart/) to learn the process of enabling collectors and monitoring more applications and services with Netdata, and see everything Netdata collects in our [supported collectors list](https://docs.netdata.cloud/collectors/collectors/). ## Acknowledgements We're extremely grateful to the following contributors for their help since our last major release in November 2019. Whether it's their first or fiftieth contribution, insights from our users not only help make Netdata better, but also remind us why we're so lucky to be part of a vibrant open-source community. - [k0ste](https://github.com/k0ste) and [DefauIt](https://github.com/DefauIt) for improving the application groups of the apps plugin. - [gmeszaros](https://github.com/gmeszaros) for a fix to the broken updater. - [blaines](https://github.com/blaines) for an `elasticsearch` collector fix. - [stevenh](https://github.com/stevenh) for adding `freeipmi` support to our Docker image and [lassebm](https://github.com/lassebm) for related fixes and documentation. - [yasharne](https://github.com/yasharne) for helping us improve the `httpcheck` collector. - [candrews](https://github.com/candrews) for the introduction of `-fno-common` in CFLAGS.
- [Jiab77](https://github.com/Jiab77) for fixing a typo in the installer options. - [amishmm](https://github.com/amishmm) for improvements to the `systemd` service files. - [tnyeanderson](https://github.com/tnyeanderson) for continuing to improve his multi-host sample dashboard. - [yasharne](https://github.com/yasharne) and especially [schneiderl](https://github.com/schneiderl) for corrections to the docs. - [lucasRolff](https://github.com/lucasRolff) for improvements to the `litespeed` collector. - [Ehekatl](https://github.com/Ehekatl) for the improvements to the Prometheus remote write API and the fix to the `softnet` alarm. - [wonsangki](https://github.com/wonsangki) for translating several docs into Korean. - [candrews](https://github.com/candrews) for fixing the option to disable the Prometheus remote API from `configure`. - [kkoomen](https://github.com/kkoomen) for improvements to the Apache proxy guide. - [vzDevelopment](https://github.com/vzDevelopment) for assistance with the unicode support in the python.d plugin. - [hexchain](https://github.com/hexchain) for the addition of pressure stall information to the proc plugin. - [nabijaczleweli](https://github.com/nabijaczleweli) and [rex4539](https://github.com/rex4539) for documentation fixes. ## Breaking Changes - Removed deprecated `bash` collectors `apache`, `cpu_apps`, `cpufreq`, `exim`, `hddtemp`, `load_average`, `mem_apps`, `mysql`, `nginx`, `phpfpm`, `postfix`, `squid`, `tomcat` [\#7962](https://github.com/netdata/netdata/pull/7962) ([ilyam8](https://github.com/ilyam8)). If you were still using one of these collectors with custom configurations, you can find the new collector that replaces it in the [supported collectors list](https://docs.netdata.cloud/collectors/collectors/). - Modified the Netdata updater to prevent unnecessary updates right after installation and to avoid updates via local tarballs [\#7939](https://github.com/netdata/netdata/pull/7939) ([prologic](https://github.com/prologic)).
These changes introduced a critical bug to the updater, which was fixed via [\#8057](https://github.com/netdata/netdata/pull/8057) [\#8076](https://github.com/netdata/netdata/pull/8076) ([prologic](https://github.com/prologic)) and [\#8028](https://github.com/netdata/netdata/pull/8028) ([gmeszaros](https://github.com/gmeszaros)). **See [issue 8056](https://github.com/netdata/netdata/issues/8056) if your Netdata is stuck on v1.19.0-432**. ## Improvements ### Host Labels - Added support for host labels [#7515](https://github.com/netdata/netdata/pull/7515) [#7449](https://github.com/netdata/netdata/pull/7449) ([amoss](https://github.com/amoss)) - Improved the monitored system information detection. Added CPU freq & cores, RAM and disk space. [\#7815](https://github.com/netdata/netdata/pull/7815) [\#7866](https://github.com/netdata/netdata/pull/7866) ([Ferroin](https://github.com/Ferroin)), [\#7862](https://github.com/netdata/netdata/pull/7862) ([thiagoftsm](https://github.com/thiagoftsm)) - Started distinguishing the monitored system's (host) OS/Kernel etc. 
from those of the docker container's [\#7770](https://github.com/netdata/netdata/pull/7770) ([amoss](https://github.com/amoss)) - Started creating host labels from collected system info [#7485](https://github.com/netdata/netdata/pull/7485) ([vlvkobal](https://github.com/vlvkobal)) - Started passing labels and container environment variables via the streaming protocol [\#7549](https://github.com/netdata/netdata/pull/7549) [\#8011](https://github.com/netdata/netdata/pull/8011) ([thiagoftsm](https://github.com/thiagoftsm)) - Started sending host labels via exporting connectors [\#7554](https://github.com/netdata/netdata/pull/7554) [\#7702](https://github.com/netdata/netdata/pull/7702) ([vlvkobal](https://github.com/vlvkobal)) - Added label support to alarm definitions and started recording them in alarm logs [\#7548](https://github.com/netdata/netdata/pull/7548) [\#7594](https://github.com/netdata/netdata/pull/7594) [#7462](https://github.com/netdata/netdata/pull/7462) [#7600](https://github.com/netdata/netdata/pull/7600) ([thiagoftsm](https://github.com/thiagoftsm)) - Added support for host labels to the API responses [#7493](https://github.com/netdata/netdata/pull/7493) [#7616](https://github.com/netdata/netdata/pull/7616) ([vlvkobal](https://github.com/vlvkobal)) - Added configurable host labels to `netdata.conf` [#7451](https://github.com/netdata/netdata/pull/7451) [#7458](https://github.com/netdata/netdata/pull/7458) ([thiagoftsm](https://github.com/thiagoftsm)) - Added kubernetes labels [#7510](https://github.com/netdata/netdata/pull/7510) [#7453](https://github.com/netdata/netdata/pull/7453) ([cakrit](https://github.com/cakrit)) ### New Collectors - eBPF kernel collector [#7979](https://github.com/netdata/netdata/pull/7979) ([thiagoftsm](https://github.com/thiagoftsm)) [#8075](https://github.com/netdata/netdata/pull/8075) ([prologic](https://github.com/prologic)) - CockroachDB (go.d.plugin #322) - squidlog: squid access log parser (go.d.plugin #304) ### Collector
improvements - apps.plugin - Created `dns` group. [\#8058](https://github.com/netdata/netdata/pull/8058) ([k0ste](https://github.com/k0ste)) - Improved `database` group. [\#8004](https://github.com/netdata/netdata/pull/8004) ([DefauIt](https://github.com/DefauIt)) - Improved `ceph` & `samba` groups. [\#7982](https://github.com/netdata/netdata/pull/7982) ([k0ste](https://github.com/k0ste)) - varnish: Added SMF metrics (cache on disk) [\#7926](https://github.com/netdata/netdata/pull/7926) ([ilyam8](https://github.com/ilyam8)) - phpfpm: Fixed per process chart titles and readme [\#7876](https://github.com/netdata/netdata/pull/7876) ([ilyam8](https://github.com/ilyam8)) - python.d: Formatted the code in all modules [\#7832](https://github.com/netdata/netdata/pull/7832) ([ilyam8](https://github.com/ilyam8)) - node.d/snmp: - Added snmpv3 support [\#7802](https://github.com/netdata/netdata/pull/7802) ([ilyam8](https://github.com/ilyam8)) - Formatted the code in `snmp.node.js` [\#7816](https://github.com/netdata/netdata/pull/7816) ([ilyam8](https://github.com/ilyam8)) - cgroups: Improved LXC monitoring by filtering out irrelevant LXC cgroups [\#7760](https://github.com/netdata/netdata/pull/7760) ([vlvkobal](https://github.com/vlvkobal)) - litespeed: Added support for different `.rtreport` format [\#7705](https://github.com/netdata/netdata/pull/7705) ([lucasRolff](https://github.com/lucasRolff)) - freeipmi: Added support to the docker image [#7081](https://github.com/netdata/netdata/pull/7081) ([stevenh](https://github.com/stevenh)) - proc.plugin: Added pressure stall information [#7209](https://github.com/netdata/netdata/pull/7209) [#7547](https://github.com/netdata/netdata/pull/7547) ([hexchain](https://github.com/hexchain)) - sensors: Improved collection logic [#7447](https://github.com/netdata/netdata/pull/7447) ([ilyam8](https://github.com/ilyam8)) - proc: Started monitoring network interface speed, duplex, operstate 
[#7395](https://github.com/netdata/netdata/pull/7395) ([stelfrag](https://github.com/stelfrag)) - smartd_log: Fixed the setting in the reallocated sectors count, by setting ATTR5 chart algorithm to absolute [#7384](https://github.com/netdata/netdata/pull/7384) ([ilyam8](https://github.com/ilyam8)) - nvidia-smi: Allow executing `nvidia-smi` in normal instead of loop mode [#7372](https://github.com/netdata/netdata/pull/7372) ([ilyam8](https://github.com/ilyam8)) - wmi: collect logon metrics, collect logical_disk disk latency metrics - weblog: handle MKCOL, PROPFIND, MOVE, SEARCH http request methods - scaleio: storage pools and sdcs metrics. (#294) ### Exporting Engine - Implemented the main flow for the Exporting Engine [#7149](https://github.com/netdata/netdata/pull/7149) ([vlvkobal](https://github.com/vlvkobal)) ### Streaming - Add versioning to the streaming protocol [\#7851](https://github.com/netdata/netdata/pull/7851) ([thiagoftsm](https://github.com/thiagoftsm)) ### Installation/Packages - Fixed missing directory when creating the symbolic link during eBPF installation and remove future options. [\#8133](https://github.com/netdata/netdata/pull/8133) ([prologic](https://github.com/thiago)) - Fixed NetData installer on \*BSD systems after libmosquitto and eBPF functionality was enabled. [\#8121](https://github.com/netdata/netdata/pull/8121) ([prologic](https://github.com/prologic)) - Fixed issues with the RPM nightly builds resulting from the bundled libmosquitto functionality that was recently merged. [\#8109](https://github.com/netdata/netdata/pull/8109) ([Ferroin](https://github.com/Ferroin)) - Corrected the invocations of `mktemp` so that they produce temporary directories in `$TEMPDIR` instead of the current directory, in a way that is compatible with busybox. 
[\#8066](https://github.com/netdata/netdata/pull/8066) ([Ferroin](https://github.com/Ferroin)) - Improved CI/CD workflow to install required packages and build the agent across all the OS/Distro\(s\) we support [\#7969](https://github.com/netdata/netdata/pull/7969) [\#7949](https://github.com/netdata/netdata/pull/7949) ([prologic](https://github.com/prologic)) - Updated the installer to download `go.d.plugin`, only if we have a new version [\#7946](https://github.com/netdata/netdata/pull/7946) ([ilyam8](https://github.com/ilyam8)) - Assorted cleanup items in the RPM spec file. [\#7927](https://github.com/netdata/netdata/pull/7927) ([Ferroin](https://github.com/Ferroin)) - Added a new, simpler, Alpine based Dockerfile for quick dev and testing [\#7914](https://github.com/netdata/netdata/pull/7914) ([prologic](https://github.com/prologic)) - Added minor fixes and improvements to the installer/updater shell scripts. [\#7847](https://github.com/netdata/netdata/pull/7847) ([prologic](https://github.com/prologic)) - Added ReviewDog CI checks - JavaScript [\#7828](https://github.com/netdata/netdata/pull/7828) ([prologic](https://github.com/prologic)) - Golang [\#7827](https://github.com/netdata/netdata/pull/7827) ([prologic](https://github.com/prologic)) - Shell scripts in PRs [\#7795](https://github.com/netdata/netdata/pull/7795) ([prologic](https://github.com/prologic)) - Stopped removing `netdata` groups/users during uninstall (Debian `postrm`) [\#7817](https://github.com/netdata/netdata/pull/7817) ([prologic](https://github.com/prologic)) - Started using the system service manager to shut down Netdata. [\#7814](https://github.com/netdata/netdata/pull/7814) ([Ferroin](https://github.com/Ferroin)) - Improved the `systemd` service files, by removing unnecessary `ExecStartPre` lines and moving global options to `netdata.conf` [\#7790](https://github.com/netdata/netdata/pull/7790) ([amishmm](https://github.com/amishmm)) - Removed unnecessary `echo` calls from the updater.
[\#7783](https://github.com/netdata/netdata/pull/7783) ([Ferroin](https://github.com/Ferroin)) - Fixed warnings in the Debian package build process and enabled the builds to work with older versions of `dpkg-buildpackage` by modifying the formatting of the trailer line in the Debian changelog template. [\#7763](https://github.com/netdata/netdata/pull/7763) ([Ferroin](https://github.com/Ferroin)) - Cleaned up static build process, by using `/bin/sh` and removing use of `sudo` [\#7725](https://github.com/netdata/netdata/pull/7725) ([prologic](https://github.com/prologic)) - Added auto-updates to `kickstart-static64` installations. [\#7704](https://github.com/netdata/netdata/pull/7704) ([Ferroin](https://github.com/Ferroin)) - Added static build support for Prometheus remote write [\#7691](https://github.com/netdata/netdata/pull/7691) ([Ehekatl](https://github.com/Ehekatl)) - Moved the script for installing required packages into the main repo. [\#7563](https://github.com/netdata/netdata/pull/7563) ([Ferroin](https://github.com/Ferroin)) - Updated the distribution support matrix. [#7636](https://github.com/netdata/netdata/pull/7636) ([Ferroin](https://github.com/Ferroin)) - Added Ubuntu 19.10 to packaging and lifecycle checks. [#7629](https://github.com/netdata/netdata/pull/7629) ([Ferroin](https://github.com/Ferroin)) - Removed EOL distros from CI jobs. [#7628](https://github.com/netdata/netdata/pull/7628) ([Ferroin](https://github.com/Ferroin)) - Made the netdata installer more flexible, to accommodate install with ssl on MacOS [#6922](https://github.com/netdata/netdata/pull/6922) ([paulkatsoulakis](https://github.com/paulkatsoulakis)) - Improved shutdown of the Netdata agent on update and uninstall. [#7595](https://github.com/netdata/netdata/pull/7595) ([Ferroin](https://github.com/Ferroin)) - Added Fedora 31 CI integrations.
[#7524](https://github.com/netdata/netdata/pull/7524) ([Ferroin](https://github.com/Ferroin)) - Removed CentOS 6 package building and lifecycle tests [#7425](https://github.com/netdata/netdata/pull/7425) ([knatsakis](https://github.com/knatsakis)), [#7430](https://github.com/netdata/netdata/pull/7430) ([ncmans](https://github.com/ncmans)) - Removed `-f` option from `groupdel` in uninstaller. [#7507](https://github.com/netdata/netdata/pull/7507) ([Ferroin](https://github.com/Ferroin)) - Injected archived backports repository on Debian Jessie for CI package builds. [#7495](https://github.com/netdata/netdata/pull/7495) ([Ferroin](https://github.com/Ferroin)) - Set the default release channel to stable [#7399](https://github.com/netdata/netdata/pull/7399) ([ncmans](https://github.com/ncmans)) - Removed EOL'd Ubuntu Trusty (14.04) from build [#7481](https://github.com/netdata/netdata/pull/7481) ([ncmans](https://github.com/ncmans)) - Corrected installer instructions during a non-privileged install [#7393](https://github.com/netdata/netdata/pull/7393) ([julidegulen](https://github.com/julidegulen)) ### Documentation - Added the step-by-step Netdata tutorial [#7489](https://github.com/netdata/netdata/pull/7489) ([joelhans](https://github.com/joelhans)) - Overhauled the installation documentation [\#7841](https://github.com/netdata/netdata/pull/7841) ([joelhans](https://github.com/joelhans)) - Refactored the collectors documentation [\#8074](https://github.com/netdata/netdata/pull/8074) ([shortpatti](https://github.com/shortpatti)), [\#8086](https://github.com/netdata/netdata/pull/8086) [\#8052](https://github.com/netdata/netdata/pull/8052) [\#7996](https://github.com/netdata/netdata/pull/7996) ([joelhans](https://github.com/joelhans)), [\#8009](https://github.com/netdata/netdata/pull/8009) [\#8005](https://github.com/netdata/netdata/pull/8005) [\#7997](https://github.com/netdata/netdata/pull/7997) ([ilyam8](https://github.com/ilyam8)) - Restructured the health 
documentation [#7329](https://github.com/netdata/netdata/pull/7329) ([joelhans](https://github.com/joelhans)) - Promoted DB engine/long-term metrics storage more heavily and fix misleading information [\#8031](https://github.com/netdata/netdata/pull/8031) ([joelhans](https://github.com/joelhans)), [\#8017](https://github.com/netdata/netdata/pull/8017) ([underhood](https://github.com/underhood)) - Updated eBPF docs with better install/enable instructions [\#8125](https://github.com/netdata/netdata/pull/8125) ([joelhans](https://github.com/joelhans)) - Allowed parentheses in heading links [\#7995](https://github.com/netdata/netdata/pull/7995) ([joelhans](https://github.com/joelhans)) - Fixed typos in the tutorial [\#7978](https://github.com/netdata/netdata/pull/7978) ([joelhans](https://github.com/joelhans)) - Indicated FreeIPMI supported in Docker image [\#7964](https://github.com/netdata/netdata/pull/7964) ([lassebm](https://github.com/lassebm)) - Fixed wrong code fragments in signing in to the cloud instructions [\#7950](https://github.com/netdata/netdata/pull/7950) ([cakrit](https://github.com/cakrit)) - Fixed variety of linter errors across docs [\#7944](https://github.com/netdata/netdata/pull/7944) [\#7526](https://github.com/netdata/netdata/pull/7526) [\#7407](https://github.com/netdata/netdata/pull/7407) ([joelhans](https://github.com/joelhans)) - Cleanup of macOS installation docs [\#7925](https://github.com/netdata/netdata/pull/7925) ([joelhans](https://github.com/joelhans)) - Fixed typo in PULL_REQUEST_TEMPLATE [\#7924](https://github.com/netdata/netdata/pull/7924) ([joelhans](https://github.com/joelhans)) - Added doc with post-install instructions for Google Cloud Platform [\#7912](https://github.com/netdata/netdata/pull/7912) ([joelhans](https://github.com/joelhans)) - Clarify the rules to create an alarm name [\#7911](https://github.com/netdata/netdata/pull/7911) ([thiagoftsm](https://github.com/thiagoftsm)) - Added docs about using caching proxies with 
our package repos. [\#7909](https://github.com/netdata/netdata/pull/7909) ([Ferroin](https://github.com/Ferroin)) - Added docs for how to build/install Netdata on CentOS 8.x [\#7890](https://github.com/netdata/netdata/pull/7890) ([prologic](https://github.com/prologic)) - Clarified editing health config files in health quickstart [\#7883](https://github.com/netdata/netdata/pull/7883) ([joelhans](https://github.com/joelhans)) - Added `retroshare` collector readme [\#7849](https://github.com/netdata/netdata/pull/7849) ([ilyam8](https://github.com/ilyam8)) - Fixed typo in the SSV formatter documentation [\#7782](https://github.com/netdata/netdata/pull/7782) ([cosmix](https://github.com/cosmix)) - Added a missing parameter to the `allmetrics` endpoint documentation [\#7776](https://github.com/netdata/netdata/pull/7776) ([vlvkobal](https://github.com/vlvkobal)) - Documented how to fix the width of badges [\#7764](https://github.com/netdata/netdata/pull/7764) ([underhood](https://github.com/underhood)) - Improved styling of documentation site and started using Algolia search [\#7753](https://github.com/netdata/netdata/pull/7753) ([joelhans](https://github.com/joelhans)) - Fixed typos in docs [\#7752](https://github.com/netdata/netdata/pull/7752) ([schneiderl](https://github.com/schneiderl)), [\#7737](https://github.com/netdata/netdata/pull/7737) ([yasharne](https://github.com/yasharne)) - Added Korean translation of some files to docs [netdata/localization issue 25](https://github.com/netdata/localization/pull/25) ([wonsangki](https://github.com/wonsangki)), [\#7723](https://github.com/netdata/netdata/pull/7723) ([cakrit](https://github.com/cakrit)) - Added better control for the introduction of new languages in docs translations [\#7722](https://github.com/netdata/netdata/pull/7722) ([cakrit](https://github.com/cakrit)) - Added a Dockerfile.docs to easily build and rebuild docs [\#7688](https://github.com/netdata/netdata/pull/7688)
([prologic](https://github.com/prologic)) - Corrected pfSense installation instructions [\#7665](https://github.com/netdata/netdata/pull/7665) ([prologic](https://github.com/prologic)) - Fixed `buildyaml.sh` script so that docs generation works correctly. [#7662](https://github.com/netdata/netdata/pull/7662) ([Ferroin](https://github.com/Ferroin)) - Made fixes to the new health documentation structure [#7419](https://github.com/netdata/netdata/pull/7419) ([joelhans](https://github.com/joelhans)) - Changed build process to allow apostrophes in headers [#7431](https://github.com/netdata/netdata/pull/7431) ([joelhans](https://github.com/joelhans)) - Added configuration details for vhost about DOSPageCount to Apache proxy guide [#7582](https://github.com/netdata/netdata/pull/7582) ([kkoomen](https://github.com/kkoomen)) - Added notice about mod_evasive to Apache proxy guide [#7578](https://github.com/netdata/netdata/pull/7578) ([joelhans](https://github.com/joelhans)) - Fixed broken docs builds [#7409](https://github.com/netdata/netdata/pull/7409) ([joelhans](https://github.com/joelhans)) - Fixed linter errors in packaging/docker/README [#7199](https://github.com/netdata/netdata/pull/7199) ([joelhans](https://github.com/joelhans)) - Updated the python.d README [#7357](https://github.com/netdata/netdata/pull/7357) ([OdysLam](https://github.com/OdysLam)) - Documented per-chart configuration options [#7345](https://github.com/netdata/netdata/pull/7345) ([joelhans](https://github.com/joelhans)) - Fixed typos and markup [#7368](https://github.com/netdata/netdata/pull/7368) ([nabijaczleweli](https://github.com/nabijaczleweli)), [#7375](https://github.com/netdata/netdata/pull/7375) ([rex4539](https://github.com/rex4539)) - Fixed errors in plugins.d/README.md [#7340](https://github.com/netdata/netdata/pull/7340) ([joelhans](https://github.com/joelhans)) ### Privacy - Added support for opting out of telemetry via the DO_NOT_TRACK environment variable
[\#7846](https://github.com/netdata/netdata/pull/7846) [\#7929](https://github.com/netdata/netdata/pull/7929) ([prologic](https://github.com/prologic)) - Fixed typo in the installer options to disable telemetry [\#7843](https://github.com/netdata/netdata/pull/7843) ([Jiab77](https://github.com/Jiab77)) - Improved documentation of opting out of anonymous statistics [#7597](https://github.com/netdata/netdata/pull/7597) ([joelhans](https://github.com/joelhans)) - Added anon tracking notice for installers [#7437](https://github.com/netdata/netdata/pull/7437) ([ncmans](https://github.com/ncmans)) ### Other - Preparations for the next netdata cloud release. Added custom `libmosquitto`, `netdata-cli` and other prerequisites: - [\#8085](https://github.com/netdata/netdata/pull/8085) [\#8067](https://github.com/netdata/netdata/pull/8067) [\#8025](https://github.com/netdata/netdata/pull/8025) [\#8047](https://github.com/netdata/netdata/pull/8047) [#7592](https://github.com/netdata/netdata/pull/7592) [#7513](https://github.com/netdata/netdata/pull/7513) ([Ferroin](https://github.com/Ferroin)) - [\#7894](https://github.com/netdata/netdata/pull/7894) [#7682](https://github.com/netdata/netdata/pull/7682) ([stelfrag](https://github.com/stelfrag)) - [\#7836](https://github.com/netdata/netdata/pull/7836) ([thiagoftsm](https://github.com/thiagoftsm)) - [\#8030](https://github.com/netdata/netdata/pull/8030) [\#7988](https://github.com/netdata/netdata/pull/7988) [#7713](https://github.com/netdata/netdata/pull/7713) ([underhood](https://github.com/underhood)) - [\#7750](https://github.com/netdata/netdata/pull/7750) ([jacekkolasa](https://github.com/jacekkolasa)) - [#7525](https://github.com/netdata/netdata/pull/7525) ([mfundul](https://github.com/mfundul)) - [#7444](https://github.com/netdata/netdata/pull/7444) ([amoss](https://github.com/amoss)) - Improved the GitHub labeler. 
[\#8071](https://github.com/netdata/netdata/pull/8071) [\#8032](https://github.com/netdata/netdata/pull/8032) ([ilyam8](https://github.com/ilyam8)), [#7543](https://github.com/netdata/netdata/pull/7543) [\#7768](https://github.com/netdata/netdata/pull/7768) [\#7699](https://github.com/netdata/netdata/pull/7699) [\#7697](https://github.com/netdata/netdata/pull/7697) [#7630](https://github.com/netdata/netdata/pull/7630) ([Ferroin](https://github.com/Ferroin)) - Added testing section to the PR template. [\#8068](https://github.com/netdata/netdata/pull/8068) ([amoss](https://github.com/amoss)) - Applied linter fixes in shell scripts [\#7937](https://github.com/netdata/netdata/pull/7937) [\#7932](https://github.com/netdata/netdata/pull/7932) [\#7915](https://github.com/netdata/netdata/pull/7915) ([prologic](https://github.com/prologic)) - Started supporting `-fno-common` in CFLAGS [\#7870](https://github.com/netdata/netdata/pull/7870) [\#7877](https://github.com/netdata/netdata/pull/7877) ([thiagoftsm](https://github.com/thiagoftsm)) - Completely removed the `unbound` python collector (dead code) [\#7853](https://github.com/netdata/netdata/pull/7853) ([ilyam8](https://github.com/ilyam8)) - Added possibility to change badges' text font color [\#7809](https://github.com/netdata/netdata/pull/7809) ([underhood](https://github.com/underhood)) - Small updates to sample multi-host dashboard, `dash.html` [\#7757](https://github.com/netdata/netdata/pull/7757) ([tnyeanderson](https://github.com/tnyeanderson)) - Added missing quoting in shell scripts. [#7685](https://github.com/netdata/netdata/pull/7685) ([Ferroin](https://github.com/Ferroin)) - Bumped handlebars from 4.2.0 to 4.5.3 [#7654](https://github.com/netdata/netdata/pull/7654) ([dependabot[bot]](https://github.com/dependabot)) - Reduced log level for `uv_thread_set_name_np` from error to info. 
[#7653](https://github.com/netdata/netdata/pull/7653) ([Saruspete](https://github.com/Saruspete)) - Added sample commands to get OS env in GitHub issue templates [#7550](https://github.com/netdata/netdata/pull/7550) ([Saruspete](https://github.com/Saruspete)) - Set standard name to non-libnetdata threads (libuv, pthread) [#7584](https://github.com/netdata/netdata/pull/7584) ([Saruspete](https://github.com/Saruspete)) ## Bug fixes - Fixed problems reported by Coverity for eBPF collector plugin. [\#8135](https://github.com/netdata/netdata/pull/8135) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed `invalid literal for float(): NN.NNt` error in the `elasticsearch` python plugin, by adding terabyte unit parsing. [\#8013](https://github.com/netdata/netdata/pull/8013) ([blaines](https://github.com/blaines)) - Fixed `timeout` failing in docker containers which broke some python.d collectors [\#8002](https://github.com/netdata/netdata/pull/8002) ([ilyam8](https://github.com/ilyam8)) - Fixed python collectors to work on `synology6` [\#7980](https://github.com/netdata/netdata/pull/7980) ([ilyam8](https://github.com/ilyam8)) - Fixed problem with the `httpcheck` python collector not being able to check URLs with the `POST` method, by adding `body` to the `URLService` [\#7956](https://github.com/netdata/netdata/pull/7956) ([ilyam8](https://github.com/ilyam8)). 
Also record the new options in `httpcheck.conf` [\#7952](https://github.com/netdata/netdata/pull/7952) ([yasharne](https://github.com/yasharne)) - Fixed `netdata-updater.sh` appearing to fail [\#7955](https://github.com/netdata/netdata/pull/7955) ([ilyam8](https://github.com/ilyam8)) - Fixed error/warnings found by shellcheck for the `netdata-updater.sh` [\#7938](https://github.com/netdata/netdata/pull/7938) ([prologic](https://github.com/prologic)) - Fixed editing configuration via `edit-config`, when NetData is installed to a symlinked `/opt` [\#7933](https://github.com/netdata/netdata/pull/7933) ([prologic](https://github.com/prologic)) - Fixed installation failures due to `.keep` files [\#7829](https://github.com/netdata/netdata/pull/7829) ([prologic](https://github.com/prologic)) - Fixed installation on FreeBSD systems with non GNU sed [\#7796](https://github.com/netdata/netdata/pull/7796) ([prologic](https://github.com/prologic)) - Fixed Source0 URL in RPM spec [\#7794](https://github.com/netdata/netdata/pull/7794) ([prologic](https://github.com/prologic)) - Fixed text if current version is >= latest version and already installed [\#8078](https://github.com/netdata/netdata/pull/8078) ([prologic](https://github.com/prologic)) - Fixed CentOS 7 RPM build failures. 
[\#7993](https://github.com/netdata/netdata/pull/7993) ([Ferroin](https://github.com/Ferroin)) - Fixed wrong messages during the build process [\#7989](https://github.com/netdata/netdata/pull/7989) ([Ferroin](https://github.com/Ferroin)) - Fixed the unit tests for the exporting engine [\#7784](https://github.com/netdata/netdata/pull/7784) ([vlvkobal](https://github.com/vlvkobal)) - Fixed a Coverity issue with an unchecked return value [\#7780](https://github.com/netdata/netdata/pull/7780) ([vlvkobal](https://github.com/vlvkobal)) - Fixed port in use after uninstall issue, by resolving a `libuv` IPC pipe cleanup problem [\#7778](https://github.com/netdata/netdata/pull/7778) ([mfundul](https://github.com/mfundul)) - Fixed dbengine repeated global flushing errors and collectors being blocked, by dropping dirty dbengine pages if the disk cannot keep up [\#7777](https://github.com/netdata/netdata/pull/7777) ([mfundul](https://github.com/mfundul)) - Fixed issue with alarm notifications occasionally ignoring the configured severity filter when the `ROLE` was set to `root`. [\#7769](https://github.com/netdata/netdata/pull/7769) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed Netlink Connection Tracker charts in the `nfacct` plugin [\#7727](https://github.com/netdata/netdata/pull/7727) ([vlvkobal](https://github.com/vlvkobal)) - Fixed support for read-only `/lib` on SystemD systems like CoreOS in static build installation [\#7726](https://github.com/netdata/netdata/pull/7726) ([prologic](https://github.com/prologic)) - Fixed `invalid shell` installer error and netdata not starting from its installed location. 
[\#7698](https://github.com/netdata/netdata/pull/7698) ([Ferroin](https://github.com/Ferroin)) - Fixed metric values sent via remote write to Prometheus backends, when using average/sum [\#7694](https://github.com/netdata/netdata/pull/7694) ([Ehekatl](https://github.com/Ehekatl)) - Fixed unclosed brackets in softnet alarm [\#7693](https://github.com/netdata/netdata/pull/7693) ([Ehekatl](https://github.com/Ehekatl)) - Fixed SEGFAULT when localhost initialization failed [\#7663](https://github.com/netdata/netdata/pull/7663) ([underhood](https://github.com/underhood)) - Fixed the handling of permissions in the installer script and the RPM spec file so that they are consistent with each other and with a clean install done with `make install`. [\#7632](https://github.com/netdata/netdata/pull/7632) ([Ferroin](https://github.com/Ferroin)) - Reduced the number of `broken pipe` error log entries, after a SIGKILL [\#7588](https://github.com/netdata/netdata/pull/7588) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed a syntax error in the packaging functions. 
[#7686](https://github.com/netdata/netdata/pull/7686) ([Ferroin](https://github.com/Ferroin)) - Fixed Coverity errors by restoring support for protobuf 3.0 [#7683](https://github.com/netdata/netdata/pull/7683) ([vlvkobal](https://github.com/vlvkobal)) - Fixed inability to disable Prometheus remote API [#7674](https://github.com/netdata/netdata/pull/7674) ([candrews](https://github.com/candrews)) - Fixed SEGFAULT from the `cpuidle` plugin [#7664](https://github.com/netdata/netdata/pull/7664) ([Saruspete](https://github.com/Saruspete)) - Fixed samba collector not working, due to inability to run `sudo` [#7655](https://github.com/netdata/netdata/pull/7655) ([ilyam8](https://github.com/ilyam8)) - Fixed invalid css/js resource errors when URL for slave node has no final / on streaming master [#7643](https://github.com/netdata/netdata/pull/7643) ([underhood](https://github.com/underhood)) - Fixed `keys_redis` chart in the `redis` collector, by populating keys at runtime [#7639](https://github.com/netdata/netdata/pull/7639) ([ilyam8](https://github.com/ilyam8)) - Fixed UrlService bytes decoding and logger unicode encoding in the python.d plugin [#7601](https://github.com/netdata/netdata/pull/7601) [#7614](https://github.com/netdata/netdata/pull/7614) ([ilyam8](https://github.com/ilyam8)), [#7376](https://github.com/netdata/netdata/pull/7376) ([vzDevelopment](https://github.com/vzDevelopment)) - Fixed a warning in the prometheus remote write backend [#7609](https://github.com/netdata/netdata/pull/7609) ([vlvkobal](https://github.com/vlvkobal)) - Fixed not detecting more than one adapter in the `hpssa` collector [#7580](https://github.com/netdata/netdata/pull/7580) ([gnoddep](https://github.com/gnoddep)) - Fixed race condition in dbengine [#7565](https://github.com/netdata/netdata/pull/7565) ([thiagoftsm](https://github.com/thiagoftsm)) - Fixed race condition with the dbengine page cache descriptors [#7478](https://github.com/netdata/netdata/pull/7478) 
([mfundul](https://github.com/mfundul)) - Fixed dbengine dirty page flushing warning [#7469](https://github.com/netdata/netdata/pull/7469) ([mfundul](https://github.com/mfundul)) - Fixed missing parenthesis on alarm softnet.conf [#7476](https://github.com/netdata/netdata/pull/7476) ([Steve8291](https://github.com/Steve8291)) - Fixed race condition in the dbengine [#7533](https://github.com/netdata/netdata/pull/7533) ([mfundul](https://github.com/mfundul)) - Fixed "Master thread EXPORTING takes too long to exit. Giving up" error, by cleaning up the main exporting engine thread on exit [#7558](https://github.com/netdata/netdata/pull/7558) ([vlvkobal](https://github.com/vlvkobal)) - Fixed rabbitmq error "update() unhandled exception: invalid literal for int() with base 10" [#7464](https://github.com/netdata/netdata/pull/7464) ([ilyam8](https://github.com/ilyam8)) - Fixed some LGTM alerts [#7441](https://github.com/netdata/netdata/pull/7441) ([jacekkolasa](https://github.com/jacekkolasa)) - Fixed valgrind errors [#7532](https://github.com/netdata/netdata/pull/7532) ([mfundul](https://github.com/mfundul)) - Fixed monit collector LGTM warnings [#7387](https://github.com/netdata/netdata/pull/7387) ([ilyam8](https://github.com/ilyam8)) - Fixed the following go.d.plugin collector issues: - mysql: panic in Cleanup (#326) - unbound: gather metrics via unix socket (#319) - logstash: pipelines chart (#317) - unbound: configuration file parsing. Support include mechanism. (#298) - logstash: pipelines metrics parsing (#293) - phpfpm: processes metrics parsing (#297) 2020-02-21T03:43:30+00:00 netdata v1.21.0 netdata v1.21.0 2020-04-06T03:12:48+00:00 # Netdata v1.21.0 Release v1.21.0 contains 2 new collectors, 3 new exporting connectors, 37 bug fixes, 46 improvements, and 25 documentation updates. We also made 26 bug fixes or improvements related to the upcoming release of Netdata Cloud. 
## At a glance We added a new **collector for Apache Pulsar**, a popular open-source distributed pub-sub messaging system. We use Pulsar in our Netdata Cloud infrastructure (more on that later this month!), and are excited to start sharing metrics about our own Pulsar systems when the time comes. The Pulsar collector attempts to auto-detect any running Pulsar processes, but you can always [configure the collector](https://docs.netdata.cloud/collectors/go.d.plugin/modules/pulsar/#configuration) based on your setup. Also new in v1.21 is a **VerneMQ collector**. We use the open-source MQ Telemetry Transport (MQTT) broker for Netdata Cloud as well. As with Pulsar, you can [configure the VerneMQ collector](https://docs.netdata.cloud/collectors/go.d.plugin/modules/vernemq/#vernemq-monitoring-with-netdata) to auto-detect your installation in just a few steps. Our experimental exporting engine received significant updates with new connectors for **[Prometheus remote write](https://docs.netdata.cloud/exporting/prometheus/remote_write/)**, **[MongoDB](https://docs.netdata.cloud/exporting/mongodb/)**, and **[AWS Kinesis Data Streams](https://docs.netdata.cloud/exporting/aws_kinesis/)**. You can now send Netdata metrics to more than 20 additional external storage providers for long-term archiving and deeper analysis. Learn more about the [exporting engine](https://docs.netdata.cloud/exporting/) in our documentation. We upgraded our **TLS compatibility to include 1.3**, which applies to HTTPS for both Netdata's web server and streaming connections. TLS 1.3 is the most up-to-date version of the TLS protocol, and contains important fixes and improvements to ensure strong encryption. If you enabled TLS in the web server or streaming, Netdata attempts to use 1.3 by default, but you can also set the version and ciphers explicitly. Learn more in the [documentation](https://docs.netdata.cloud/web/server/#select-tls-version). 
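To confirm from a client that a TLS-enabled agent actually negotiates 1.3, something like the following minimal sketch can help. It uses only Python's standard `ssl` and `socket` modules; the host name in the usage note is a placeholder, the helper names are made up for this example, and port 19999 is assumed to be the dashboard port in use. It is not part of Netdata itself.

```python
import socket
import ssl


def tls13_only_context() -> ssl.SSLContext:
    """Build a client context that refuses anything older than TLS 1.3."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


def negotiated_tls_version(host: str, port: int = 19999, timeout: float = 5.0) -> str:
    """Connect to host:port and return the negotiated protocol, e.g. 'TLSv1.3'.

    Raises ssl.SSLError if the server cannot speak TLS 1.3 or if its
    certificate does not verify against the system trust store.
    """
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with tls13_only_context().wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version("netdata.example.com")` (hypothetical host) either returns the negotiated version string or raises. An agent using a self-signed certificate will fail verification with the default context, so such setups need a context that trusts that certificate.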
The Netdata dashboard has been **completely re-written in React**. While the look and behavior haven't changed, these under-the-hood changes enable a suite of new features, UX improvements, and design overhauls. With React, we'll be able to work faster and better resource our talented engineers. As part of the ongoing work to polish our **eBPF collector tech preview**, we've now proven the collector's performance is very good, and have vastly expanded the number of operating system versions the collector works on. Learn how to [enable it](https://docs.netdata.cloud/collectors/ebpf_process.plugin/) in our documentation. We've also extensively stress-tested the eBPF collector and found that it's impressively fast given the depth of metrics it collects! Read up on our benchmarking analysis [on GitHub](https://github.com/netdata/netdata/issues/8195). ## Acknowledgments - [Jiab77](https://github.com/Jiab77) for helping remove extra printed `\n` in various installation methods. - [SamK](https://github.com/SamK) for fixing missing folders in `/var/` for .deb installations. - [kevenwyld](https://github.com/kevenwyld) for improving Netdata's support of RHEL distributions. - [WoozyMasta](https://github.com/WoozyMasta) for adding in the ability to get Kubernetes pod names with `kubectl` in bare-metal deployments. - [paulmezz](https://github.com/paulmezz) for adding the ability to connect to non-admin user IDs when trying to collect metrics from a Ceph storage cluster. - [ManuelPombo](https://github.com/ManuelPombo) for adding additional charts to our Postgres collector, and [anayrat](https://github.com/anayrat) for helping review the changes. - [Default](https://github.com/DefauIt) for adding lsyncd to the backup group in `apps.plugin`. 
- [bceylan](https://github.com/bceylan), [peroxy](https://github.com/peroxy), [toadjaune](https://github.com/toadjaune), [grinapo](https://github.com/grinapo), [m-rey](https://github.com/m-rey), and [YorikSar](https://github.com/YorikSar) for documentation fixes. ## Breaking changes None. ## Improvements - Extended TLS support for 1.3. ([\#8505](https://github.com/netdata/netdata/pull/8505)) by [thiagoftsm](https://github.com/thiagoftsm) - Switched to the React dashboard code as the default dashboard. ([\#8363](https://github.com/netdata/netdata/pull/8363)) by [Ferroin](https://github.com/Ferroin) ### Collectors - Added a new Pulsar collector. ([\#8364](https://github.com/netdata/netdata/pull/8364)) by [ilyam8](https://github.com/ilyam8) - Added a new VerneMQ collector. ([\#8236](https://github.com/netdata/netdata/pull/8236)) by [ilyam8](https://github.com/ilyam8) - Added high precision timer support for plugins such as `idlejitter`. ([\#8441](https://github.com/netdata/netdata/pull/8441)) by [mfundul](https://github.com/mfundul) - Added an alarm to the `dns_query` collector that detects DNS query failure. ([\#8434](https://github.com/netdata/netdata/pull/8434)) by [ilyam8](https://github.com/ilyam8) - Added the ability to get the pod name from cgroup with `kubectl` in bare-metal deployments. ([\#7416](https://github.com/netdata/netdata/pull/7416)) by [WoozyMasta](https://github.com/WoozyMasta) - Added the ability to connect to non-admin user IDs for a Ceph storage cluster. ([\#8276](https://github.com/netdata/netdata/pull/8276)) by [paulmezz](https://github.com/paulmezz) - Added connections (backend) usage to Postgres monitoring. ([\#8126](https://github.com/netdata/netdata/pull/8126)) by [ManuelPombo](https://github.com/ManuelPombo) - eBPF: Added support for additional Linux kernels found in Debian 10.2 and Ubuntu 18.04. 
([\#8192](https://github.com/netdata/netdata/pull/8192)) by [thiagoftsm](https://github.com/thiagoftsm) ### Packaging/installation - Added missing override for Ubuntu Eoan. ([\#8547](https://github.com/netdata/netdata/pull/8547)) by [prologic](https://github.com/prologic) - Added Docker build arguments to pass extra options to Netdata installer. ([\#8472](https://github.com/netdata/netdata/pull/8472)) by [Ferroin](https://github.com/Ferroin) - Added deferred error message handling to the installer. ([\#8381](https://github.com/netdata/netdata/pull/8381)) by [Ferroin](https://github.com/Ferroin) - Fixed cosmetic error checking for CentOS 8 version in `install-required-packages.sh`. ([\#8339](https://github.com/netdata/netdata/pull/8339)) by [prologic](https://github.com/prologic) - Added various fixes and improvements to the installers. ([\#8315](https://github.com/netdata/netdata/pull/8315)) by [Ferroin](https://github.com/Ferroin) - Migrated to installing only Python 3 packages during installation. ([\#8318](https://github.com/netdata/netdata/pull/8318)) by [Ferroin](https://github.com/Ferroin) - Improved support for RHEL by not installing the CUPS plugin when v1.7 of CUPS cannot be installed. ([\#7216](https://github.com/netdata/netdata/pull/7216)) by [kevenwyld](https://github.com/kevenwyld) - Added support for Clear Linux in `install-required-packages.sh`. ([\#8154](https://github.com/netdata/netdata/pull/8154)) by [Ferroin](https://github.com/Ferroin) - Removed Fedora 29 from CI and packaging. ([\#8100](https://github.com/netdata/netdata/pull/8100)) by [Ferroin](https://github.com/Ferroin) - Removed Ubuntu 19.04 from CI and packaging. ([\#8040](https://github.com/netdata/netdata/pull/8040)) by [Ferroin](https://github.com/Ferroin) - Removed OpenSUSE Leap 15.0 from CI. ([\#7990](https://github.com/netdata/netdata/pull/7990)) by [Ferroin](https://github.com/Ferroin) ### Exporting - Added a MongoDB connector to the exporting engine. 
([\#8416](https://github.com/netdata/netdata/pull/8416)) by [vlvkobal](https://github.com/vlvkobal) - Added a Prometheus Remote Write connector to the exporting engine. ([\#8292](https://github.com/netdata/netdata/pull/8292)) by [vlvkobal](https://github.com/vlvkobal) - Added an AWS Kinesis connector to the exporting engine. ([\#8145](https://github.com/netdata/netdata/pull/8145)) by [vlvkobal](https://github.com/vlvkobal) ### Documentation - Fixed typo in main `README.md`. ([\#8547](https://github.com/netdata/netdata/pull/8547)) by [bceylan](https://github.com/bceylan) - Updated the update instructions with per-method details. ([\#8394](https://github.com/netdata/netdata/pull/8394)) by [joelhans](https://github.com/joelhans) - Updated paragraph on `install-required-packages.sh`. ([\#8347](https://github.com/netdata/netdata/pull/8347)) by [prologic](https://github.com/prologic) - Added Patti's dashboard video to the documentation. ([\#8385](https://github.com/netdata/netdata/pull/8385)) by [joelhans](https://github.com/joelhans) - Fixed go.d modules in the `COLLECTORS.md`. ([\#8380](https://github.com/netdata/netdata/pull/8380)) by [ilyam8](https://github.com/ilyam8) - Added frontmatter to all documentation in bulk. ([\#8354](https://github.com/netdata/netdata/pull/8354)) and ([\#8372](https://github.com/netdata/netdata/pull/8372)) by [joelhans](https://github.com/joelhans) - Fixed MDX parsing in installation guide. ([\#8362](https://github.com/netdata/netdata/pull/8362)) by [joelhans](https://github.com/joelhans) - Fixed typo in eBPF documentation. ([\#8360](https://github.com/netdata/netdata/pull/8360)) by [ilyam8](https://github.com/ilyam8) - Fixed links in packaging/installer to work on GitHub and docs. ([\#8319](https://github.com/netdata/netdata/pull/8319)) by [joelhans](https://github.com/joelhans) - Fixed typo in main `README.md`. 
([\#8335](https://github.com/netdata/netdata/pull/8335)) by [peroxy](https://github.com/peroxy) - Removed mention saying that .deb packages are experimental. ([\#8250](https://github.com/netdata/netdata/pull/8250)) by [toadjaune](https://github.com/toadjaune) - Added standards for abbreviations/acronyms to docs style guide. ([\#8313](https://github.com/netdata/netdata/pull/8313)) by [joelhans](https://github.com/joelhans) - Tweaked eBPF documentation, and added performance data. ([\#8261](https://github.com/netdata/netdata/pull/8261)) by [joelhans](https://github.com/joelhans) - Added requirements for the exim collector. ([\#8096](https://github.com/netdata/netdata/pull/8096)) by [petarkozic](https://github.com/petarkozic) - Fixed misspelling of openSUSE and SUSE. ([\#8233](https://github.com/netdata/netdata/pull/8233)) by [m-rey](https://github.com/m-rey) - Added OpenGraph tags to documentation pages. ([\#8224](https://github.com/netdata/netdata/pull/8224)) by [joelhans](https://github.com/joelhans) - Fixed typo in custom dashboard documentation. ([\#8213](https://github.com/netdata/netdata/pull/8213)) by [shortpatti](https://github.com/shortpatti) - Removed extra asterisks in main README. ([\#8193](https://github.com/netdata/netdata/pull/8193)) by [grinapo](https://github.com/grinapo) - Added eBPF README to documentation navigation and improved page title. ([\#8191](https://github.com/netdata/netdata/pull/8191)) by [joelhans](https://github.com/joelhans) - Fixed figure+image without closing tag in new documentation. ([\#8177](https://github.com/netdata/netdata/pull/8177)) by [joelhans](https://github.com/joelhans) - Corrected instructions for running Netdata behind Apache. ([\#8169](https://github.com/netdata/netdata/pull/8169)) by [cakrit](https://github.com/cakrit) - Added PR title guidelines to the contribution guidelines to make `CHANGELOG.md` more meaningful. 
([\#8150](https://github.com/netdata/netdata/pull/8510)) by [cakrit](https://github.com/cakrit) - Fixed formatting in Custom dashboards documentation. ([\#8102](https://github.com/netdata/netdata/pull/)) by [YorikSar](https://github.com/YorikSar) - Updated the manual install documentation with better information about CentOS 6. ([\#8088](https://github.com/netdata/netdata/pull/8088)) by [Ferroin](https://github.com/Ferroin) - Added tutorials to support v1.20 release. ([\#7943](https://github.com/netdata/netdata/pull/7943)) by [joelhans](https://github.com/joelhans) ### CI/CD - Added logic to bail early on LWS build if cmake is not present. ([\#8559](https://github.com/netdata/netdata/pull/8559)) by [Ferroin](https://github.com/Ferroin) - Added `python.d` configuration files to YAML linting CI process and increased the line limit to 120 characters. ([\#8541](https://github.com/netdata/netdata/pull/8541)) and ([\#8542](https://github.com/netdata/netdata/pull/8542)) by [ilyam8](https://github.com/ilyam8) - Cleaned up GitHub Actions workflows. ([\#8383](https://github.com/netdata/netdata/pull/8383)) by [Ferroin](https://github.com/Ferroin) - Migrated tests from Travis CI to GitHub Workflows. ([\#8331](https://github.com/netdata/netdata/pull/8331)) by [prologic](https://github.com/prologic) - Covered `install-required-packages.sh` with Coverity scan. ([\#8388](https://github.com/netdata/netdata/pull/8388)) by [prologic](https://github.com/prologic) - Added support for cross-host docker-compose builds. ([\#7754](https://github.com/netdata/netdata/pull/7754)) by [amoss](https://github.com/amoss) - Reconfigured Travis CI to retry transient failures on lifecycle tests. ([\#8203](https://github.com/netdata/netdata/pull/8203)) by [prologic](https://github.com/prologic) - Switched to checkout@v2 in GitHub Actions. ([\#8170](https://github.com/netdata/netdata/pull/8170)) by [ilyam8](https://github.com/ilyam8) ### Other - Added lsyncd to the backup group in `apps.plugin`. 
([\#8159](https://github.com/netdata/netdata/pull/8159)) by [Default](https://github.com/DefauIt) ## Netdata Cloud - Fixed compiler warnings in the claiming code. ([\#8567](https://github.com/netdata/netdata/pull/8567)) by [vlvkobal](https://github.com/vlvkobal) - Fixed regressions in cloud functionality (build, CI, claiming). ([\#8568](https://github.com/netdata/netdata/pull/8568)) by [underhood](https://github.com/underhood) - Switched over to soft feature flag. ([\#8545](https://github.com/netdata/netdata/pull/8545)) by [amoss](https://github.com/amoss) - Improved claiming behavior to run as `netdata` user by default, or override if necessary. ([\#8516](https://github.com/netdata/netdata/pull/8516)) by [amoss](https://github.com/amoss) - Updated the `info` endpoint for Cloud notifications. ([\#8519](https://github.com/netdata/netdata/pull/8519)) by [amoss](https://github.com/amoss) - Added correct error logging for ACLK challenge/response. ([\#8538](https://github.com/netdata/netdata/pull/8538)) by [stelfrag](https://github.com/stelfrag) - Cleaned up Cloud configuration files to move `[agent_cloud_link]` settings to `[cloud]`. ([\#8501](https://github.com/netdata/netdata/pull/8501)) by [underhood](https://github.com/underhood) - Enhanced ACLK header payload to include `timestamp-offset-usec`. ([\#8499](https://github.com/netdata/netdata/pull/8499)) by [stelfrag](https://github.com/stelfrag) - Added ACLK build failures to anonymous statistics. ([\#8429](https://github.com/netdata/netdata/pull/8429)) by [underhood](https://github.com/underhood) - Added ACLK connection failures to anonymous statistics. ([\#8456](https://github.com/netdata/netdata/pull/8456)) by [underhood](https://github.com/underhood) - Added HTTP proxy support to ACLK. ([\#8406](https://github.com/netdata/netdata/pull/8406))/([\#8418](https://github.com/netdata/netdata/pull/8418)) by [underhood](https://github.com/underhood) - Improved ownership of the `claim.d` directory. 
([\#8475](https://github.com/netdata/netdata/pull/8475)) by [amoss](https://github.com/amoss) - Fixed the ACLK response payload to match the new specification. ([\#8420](https://github.com/netdata/netdata/pull/8420)) by [stelfrag](https://github.com/stelfrag) - Added the new cloud info in the info endpoint. ([\#8430](https://github.com/netdata/netdata/pull/8430)) by [amoss](https://github.com/amoss) - Implemented ACLK Last Will and Testament. ([\#8410](https://github.com/netdata/netdata/pull/8410)) by [stelfrag](https://github.com/stelfrag) - Fixed JSON parsing in ACLK. ([\#8426](https://github.com/netdata/netdata/pull/8426)) by [stelfrag](https://github.com/stelfrag) - Fixed outstanding problems in claiming and added SOCKS5 support. ([\#8406](https://github.com/netdata/netdata/pull/8406))/([\#8404](https://github.com/netdata/netdata/pull/8404)) by [amoss](https://github.com/amoss) and [underhood](https://github.com/underhood) - Fixed the type value for alarm updates in the ACLK. ([\#8403](https://github.com/netdata/netdata/pull/8403)) by [stelfrag](https://github.com/stelfrag) - Improved performance of ACLK. ([\#8399](https://github.com/netdata/netdata/pull/8399))/([\#8401](https://github.com/netdata/netdata/pull/8401)) by [amoss](https://github.com/amoss) - Improved the ACLK's agent "pop-corning" phase. ([\#8398](https://github.com/netdata/netdata/pull/8398)) by [stelfrag](https://github.com/stelfrag) - Improved ACLK according to results of the smoke-test. ([\#8358](https://github.com/netdata/netdata/pull/8358)) by [amoss](https://github.com/amoss) and [underhood](https://github.com/underhood) - Added code to bundle LWS in binary packages. ([\#8255](https://github.com/netdata/netdata/pull/8255)) by [Ferroin](https://github.com/Ferroin) - Added libwebsockets files to `make dist`. ([\#8275](https://github.com/netdata/netdata/pull/8275)) by [Ferroin](https://github.com/Ferroin) - Adapted the claiming script to new API responses. 
([\#8245](https://github.com/netdata/netdata/pull/8245)) by [hmoragrega](https://github.com/hmoragrega) - Fixed claiming script to reflect Netdata Cloud API changes. ([\#8220](https://github.com/netdata/netdata/pull/8220)) by [cosmix](https://github.com/cosmix) - Added libwebsockets bundling code to `netdata-installer.sh`. ([\#8144](https://github.com/netdata/netdata/pull/8144)) by [Ferroin](https://github.com/Ferroin) ## Bug fixes - Removed notifications from the dashboard and fixed the `/default.html` route. ([\#8599](https://github.com/netdata/netdata/pull/8599)) by [jacekkolasa](https://github.com/jacekkolasa) - Fixed `help-tooltips` styling, private registry node deletion, and the right-hand sidebar "jumping" on document clicks. ([\#8553](https://github.com/netdata/netdata/pull/8553)) by [jacekkolasa](https://github.com/jacekkolasa) - Fixed errors reported by Coverity. ([\#8593](https://github.com/netdata/netdata/pull/8593)) by [thiagoftsm](https://github.com/thiagoftsm), ([\#8579](https://github.com/netdata/netdata/pull/8579)) by [amoss](https://github.com/amoss), and ([\#8586](https://github.com/netdata/netdata/pull/8586)) by [thiagoftsm](https://github.com/thiagoftsm) - Added `netdata.service.*` to `.gitignore` to hide `system/netdata.service.v235` file. ([\#8556](https://github.com/netdata/netdata/pull/8556)) by [vlvkobal](https://github.com/vlvkobal) - Fixed Debian 8 (Jessie) support. ([\#8590](https://github.com/netdata/netdata/pull/8590)) and ([\#8593](https://github.com/netdata/netdata/pull/8593)) by [prologic](https://github.com/prologic) - Fixed broken Fedora 30/31 RPM builds. ([\#8572](https://github.com/netdata/netdata/pull/8572)) by [prologic](https://github.com/prologic) - Fixed broken pipe ignoring in `apps.plugin`. ([\#8554](https://github.com/netdata/netdata/pull/8554)) by [vlvkobal](https://github.com/vlvkobal) - Fixed the `bytespersec` chart context in the Python Apache collector. 
([\#8550](https://github.com/netdata/netdata/pull/8550)) by [ilyam8](https://github.com/ilyam8) - Fixed `charts.d.plugin` to exit properly during Netdata service restart. ([\#8529](https://github.com/netdata/netdata/pull/8529)) by [ilyam8](https://github.com/ilyam8) - Fixed minimist dependency vulnerability. ([\#8537](https://github.com/netdata/netdata/pull/8537)) by [jacekkolasa](https://github.com/jacekkolasa) - Fixed our Debian/Ubuntu packages to package the expected systemd unit files. ([\#8468](https://github.com/netdata/netdata/pull/8468)) by [prologic](https://github.com/prologic) - Fixed auto-updates for static (`kickstart-static64.sh`) installs. ([\#8507](https://github.com/netdata/netdata/pull/8507)) by [prologic](https://github.com/prologic) - Fixed openSUSE 15.1 RPM package builds. ([\#8494](https://github.com/netdata/netdata/pull/8494)) by [prologic](https://github.com/prologic) - Fixed how SimpleService truncates Python module names. ([\#8492](https://github.com/netdata/netdata/pull/8492)) by [ilyam8](https://github.com/ilyam8) - Removed erroneous `\n` in uninstaller output. ([\#8446](https://github.com/netdata/netdata/pull/8446)) by [prologic](https://github.com/prologic) - Fixed `install-required-packages` script to self-update `apt`. ([\#8491](https://github.com/netdata/netdata/pull/8491)) by [prologic](https://github.com/prologic) - Added proper prefix to Python module names during loading. ([\#8474](https://github.com/netdata/netdata/pull/8474)) by [ilyam8](https://github.com/ilyam8) - Fixed how the Netdata updater script cleans up after being run. ([\#8414](https://github.com/netdata/netdata/pull/8414)) by [prologic](https://github.com/prologic) - Fixed the flushing error threshold with the database engine. ([\#8425](https://github.com/netdata/netdata/pull/8425)) by [mfundul](https://github.com/mfundul) - Fixed memory leak for host labels streaming from slaves to master. 
([\#8460](https://github.com/netdata/netdata/pull/8460)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed support for uninstalling the eBPF collector in the uninstaller. ([\#8444](https://github.com/netdata/netdata/pull/8444)) by [prologic](https://github.com/prologic) - Fixed a bug involving `stop_all_netdata uv_pipe_connect()` in the installer. ([\#8444](https://github.com/netdata/netdata/pull/8444)) by [prologic](https://github.com/prologic) - Fixed installer output regarding newlines. ([\#8447](https://github.com/netdata/netdata/pull/8447)) by [prologic](https://github.com/prologic) - Fixed broken dependencies for Ubuntu 19.10. ([\#8397](https://github.com/netdata/netdata/pull/8397)) by [prologic](https://github.com/prologic) - Fixed streaming scaling. ([\#8375](https://github.com/netdata/netdata/pull/8375)) by [mfundul](https://github.com/mfundul) - Fixed missing characters in kernel version field by encoding slave fields. ([\#8216](https://github.com/netdata/netdata/pull/8216)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed installation for Ubuntu 14.04 ([\#7690](https://github.com/netdata/netdata/pull/7690)) by [Ehekatl](https://github.com/Ehekatl) - Fixed dependencies for Debian Jessie. ([\#8290](https://github.com/netdata/netdata/pull/8290)) by [Ferroin](https://github.com/Ferroin) - Fixed dependency names for Arch Linux. ([\#8334](https://github.com/netdata/netdata/pull/8334)) by [Ferroin](https://github.com/Ferroin) - Removed extra printed `\n` in various installers. ([\#8324](https://github.com/netdata/netdata/pull/8324))/([\#8325](https://github.com/netdata/netdata/pull/8325))/([\#8326](https://github.com/netdata/netdata/pull/8326)) by [Jiab77](https://github.com/Jiab77) - Fixed missing folders in `/var/` for .deb packages. ([\#8314](https://github.com/netdata/netdata/pull/8314)) by [SamK](https://github.com/SamK) - Fixed Ceph collector to get `osd_perf_infos` in versions 14.2 and higher. 
([\#8248](https://github.com/netdata/netdata/pull/8248)) by [ilyam8](https://github.com/ilyam8) - Fixed RHEL / CentOS 8.x dependencies for Judy-devel and others. ([\#8202](https://github.com/netdata/netdata/pull/8202)) by [prologic](https://github.com/prologic) - Removed extraneous commas from chart information in dashboard. ([\#8266](https://github.com/netdata/netdata/pull/8266)) by [FlyingSixtySix](https://github.com/FlyingSixtySix) - Removed `tmem` collection from xenstat_plugin to allow Netdata on Xen 4.13 to compile successfully. ([\#7951](https://github.com/netdata/netdata/pull/7951)) by [rushikeshjadhav](https://github.com/rushikeshjadhav) - Fixed `get_latest_version` for nightly channel update script. ([\#8172](https://github.com/netdata/netdata/pull/8172)) by [ilyam8](https://github.com/ilyam8) - Restricted messages to Google Analytics. ([\#8161](https://github.com/netdata/netdata/pull/8161)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed Python 3 dict access in OpenLDAP collector module. ([\#8162](https://github.com/netdata/netdata/pull/8162)) by [Mic92](https://github.com/Mic92) 2020-04-06T03:12:48+00:00 netdata v1.21.1 netdata v1.21.1 2020-04-13T16:14:20+00:00 # Netdata v1.21.1 Release v1.21.1 is a hotfix release to improve the performance of the new React dashboard, which was merged and enabled by default in v1.21.0. The React dashboard shipped in v1.21.0 did not properly freeze charts that were outside of the browser's viewport. If a user loaded many charts by scrolling through the dashboard, charts outside of their browser's viewport continued updating. This excess of chart updates caused _all_ charts to update more slowly than every second. v1.21.1 includes improvements to the way the Netdata dashboard freezes, maintains state, and restores charts as users scroll. 
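The freezing behavior described above can be illustrated with a minimal sketch. This is not the dashboard's actual code (the dashboard is a React application); the `Chart` class, the `is_visible` and `refresh` helpers, and the pixel values are all hypothetical, chosen only to show the idea of updating charts that overlap the viewport and freezing the rest.

```python
from dataclasses import dataclass


@dataclass
class Chart:
    """Hypothetical chart record: a name plus its vertical position on the page."""
    name: str
    top: int          # offset of the chart's top edge, in pixels
    height: int       # rendered height, in pixels
    frozen: bool = False


def is_visible(chart, viewport_top, viewport_height):
    """Return True when the chart overlaps the viewport vertically."""
    chart_bottom = chart.top + chart.height
    viewport_bottom = viewport_top + viewport_height
    return chart.top < viewport_bottom and chart_bottom > viewport_top


def refresh(charts, viewport_top, viewport_height):
    """Freeze off-screen charts and return the names of the charts to update."""
    to_update = []
    for chart in charts:
        chart.frozen = not is_visible(chart, viewport_top, viewport_height)
        if not chart.frozen:
            to_update.append(chart.name)
    return to_update


if __name__ == "__main__":
    page = [Chart("cpu", 0, 300), Chart("ram", 300, 300), Chart("disk", 2000, 300)]
    print(refresh(page, viewport_top=0, viewport_height=800))
    print(refresh(page, viewport_top=1900, viewport_height=800))
```

Calling `refresh` again after each scroll keeps the per-second update work proportional to the number of visible charts rather than to every chart the user has ever scrolled past.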
2020-04-13T16:14:20+00:00 netdata v1.22.0 netdata v1.22.0 2020-05-11T13:14:37+00:00 # Release v1.22.0 Release v1.22.0 marks the official launch of our rearchitected Netdata Cloud! This Agent release contains both backend and interface changes necessary to connect your distributed nodes to this dramatically improved experience. Netdata Cloud builds on top of our open source monitoring Agent to give you real-time visibility for your entire infrastructure. Once you've connected your Agents to Cloud, you can view key metrics, insightful charts, and active alarms from all your nodes in a single web interface. When an anomaly strikes, seamlessly navigate to any node to troubleshoot and discover the root cause with the familiar Netdata dashboard. ![Animated GIF of Netdata Cloud](https://user-images.githubusercontent.com/1153921/80828986-1ebb3b00-8b9b-11ea-957f-2c8d0d009e44.gif) **[Sign in to Cloud](https://app.netdata.cloud)** and read our [Get started with Cloud](https://learn.netdata.cloud/docs/cloud/get-started/) guide for details on updating your nodes, claiming them, and navigating the new Cloud. While Netdata Cloud offers a centralized method of monitoring your Agents, your metrics data is not stored or centralized in any way. Metrics data remains with your nodes and is only streamed to your browser through Cloud. In addition, Cloud only expands on the functionality of the wildly popular free and open source Agent. We will never make any of our open source Agent features Cloud-exclusive, and we will actively continue to develop the Agent so that we can integrate new features with Netdata Cloud. This release also contains 1 new collector, 1 new exporting connector, 1 new alarm notification method, 27 improvements, 16 documentation updates, and 22 bug fixes. ## At a glance We added a new collector called `whoisquery` that helps you **monitor a domain name's expiration date**. 
You can track as many domains as you'd like, and set custom warning and critical thresholds for each. For more information on setup and configuration, see the [Whois domain expiry monitoring documentation](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/whoisquery/). We added a new connector to our experimental exporting engine: **[Prometheus remote write](https://learn.netdata.cloud/docs/agent/exporting/prometheus/remote_write/)**. You can use this connector to send Netdata metrics to your choice of more than 20 external storage providers for long-term archiving and further analysis. Our new documentation experience is now available at **[Netdata Learn](https://learn.netdata.cloud)**! We encourage you to try it out and give us feedback or ask questions in our [GitHub issues](https://github.com/netdata/netdata/issues/new/choose). Learn features documentation for both the Agent and Cloud in separate-but-connected vaults, which streamlines the experience of learning about both products. While Learn only features documentation for now, we plan on releasing more types of educational content serving the Agent's open-source community of developers, sysadmins, and DevOps folks. We'll have more to announce soon, but in the meantime, we hope you enjoy what we believe is a smoother (and prettier) docs experience. ## Acknowledgments - [amishmm](https://github.com/amishmm) for updating `netdata.conf` and `netdata.service.v235.in`. - [adamwolf](https://github.com/adamwolf) for fixing a typo in `netdata-installer.sh`. - [lassebm](https://github.com/lassebm) for fixing a crash when shutting down an Agent with the ACLK disabled. - [yasharne](https://github.com/yasharne) for adding a new whoisquery collector and for adding health alarm templates for both the whoisquery and x509check collectors. - [illumine](https://github.com/illumine) for adding Dynatrace as a new alarm notification method. 
- [slavaGanzin](https://github.com/slavaGanzin), [carehart](https://github.com/carehart), [Jiab77](https://github.com/Jiab77), and [IceCodeNew](https://github.com/IceCodeNew) for documentation fixes and improvements. ## Breaking changes - The previous iteration of Netdata Cloud, accessible through various **Sign in** and **Nodes view (beta)** buttons on the Agent dashboard, is deprecated in favor of the new Cloud experience. - Our old documentation site (`docs.netdata.cloud`) was replaced with [**Netdata Learn**](https://learn.netdata.cloud). All existing backlinks redirect to the new site. - Our [localization project](https://github.com/netdata/localization) is no longer actively maintained. We're grateful for the hard work of its contributors. ## Improvements ### Netdata Cloud - Enabled support for Netdata Cloud. ([\#8478](https://github.com/netdata/netdata/pull/8478)), ([\#8836](https://github.com/netdata/netdata/pull/8836)), ([\#8843](https://github.com/netdata/netdata/pull/8843)), ([\#8838](https://github.com/netdata/netdata/pull/8838)), ([\#8840](https://github.com/netdata/netdata/pull/8840)), ([\#8850](https://github.com/netdata/netdata/pull/8850)), ([\#8853](https://github.com/netdata/netdata/pull/8853)), ([\#8866](https://github.com/netdata/netdata/pull/8866)), ([\#8871](https://github.com/netdata/netdata/pull/8871)), ([\#8858](https://github.com/netdata/netdata/pull/8858)), ([\#8870](https://github.com/netdata/netdata/pull/8870)), ([\#8904](https://github.com/netdata/netdata/pull/8904)), ([\#8895](https://github.com/netdata/netdata/pull/8895)), ([\#8927](https://github.com/netdata/netdata/pull/8927)), ([\#8944](https://github.com/netdata/netdata/pull/8944)) by [amoss](https://github.com/amoss), [jacekkolasa](https://github.com/jacekkolasa), [Ferroin](https://github.com/Ferroin), [prologic](https://github.com/prologic), [mfundul](https://github.com/mfundul), [underhood](https://github.com/underhood), and [stelfrag](https://github.com/stelfrag). 
- Added TTL headers to ACLK responses. ([\#8760](https://github.com/netdata/netdata/pull/8760)) by [amoss](https://github.com/amoss) - Improved the thread exit fixes in [\#8750](https://github.com/netdata/netdata/pull/8750). ([\#8750](https://github.com/netdata/netdata/pull/8750)) by [amoss](https://github.com/amoss) - Added support for building libmosquitto on FreeBSD/macOS. ([\#8254](https://github.com/netdata/netdata/pull/8254)) by [Ferroin](https://github.com/Ferroin) - Improved ACLK reconnection sequence. ([\#8729](https://github.com/netdata/netdata/pull/8729)) by [stelfrag](https://github.com/stelfrag) - Improved ACLK memory management and shutdown sequence. ([\#8611](https://github.com/netdata/netdata/pull/8611)) by [stelfrag](https://github.com/stelfrag) - Added `session-id` to ACLK using connect timestamp. ([\#8633](https://github.com/netdata/netdata/pull/8633)) by [amoss](https://github.com/amoss) ### Collectors - Improved the index size for the eBPF collector. ([\#8743](https://github.com/netdata/netdata/pull/8743)) by [thiagoftsm](https://github.com/thiagoftsm) - Added health alarm templates for the whoisquery collector. ([\#8700](https://github.com/netdata/netdata/pull/8700)) by [yasharne](https://github.com/yasharne) - Added a whoisquery collector. [go.d.plugin/#368](https://github.com/netdata/go.d.plugin/pull/368) by [yasharne](https://github.com/yasharne) - Removed an automatic restart of `apps.plugin`. ([\#8592](https://github.com/netdata/netdata/pull/8592)) by [vlvkobal](https://github.com/vlvkobal) ### Packaging/installation - Added missing `NETDATA_STOP_CMD` in `netdata-installer.sh`. ([\#8897](https://github.com/netdata/netdata/pull/8897)) by [prologic](https://github.com/prologic) - Added JSON-C dependency handling to installation and packaging. 
([\#8776](https://github.com/netdata/netdata/pull/8776)) by [Ferroin](https://github.com/Ferroin) - Added a check to wait for a recently-published tag to appear in Docker Hub before publishing new images. ([\#8713](https://github.com/netdata/netdata/pull/8713)) by [knatsakis](https://github.com/knatsakis) - Removed obsolete scripts from Docker images. ([\#8704](https://github.com/netdata/netdata/pull/8704)) by [knatsakis](https://github.com/knatsakis) - Removed obsolete DEVEL support from Docker images. ([\#8702](https://github.com/netdata/netdata/pull/8702)) by [knatsakis](https://github.com/knatsakis) - Improved how we publish Docker images by pushing synchronously. ([\#8701](https://github.com/netdata/netdata/pull/8701)) by [knatsakis](https://github.com/knatsakis) ### Exporting - Enabled internal statistics for the exporting engine in the Agent dashboard. ([\#8635](https://github.com/netdata/netdata/pull/8635)) by [vlvkobal](https://github.com/vlvkobal) - Implemented a Prometheus exporter web API endpoint. ([\#8540](https://github.com/netdata/netdata/pull/8540)) by [vlvkobal](https://github.com/vlvkobal) ### Notifications - Added a certificate revocation alarm for the x509check collector. ([\#8684](https://github.com/netdata/netdata/pull/8684)) by [yasharne](https://github.com/yasharne) - Added the ability to send Agent alarm notifications to Dynatrace. ([\#8476](https://github.com/netdata/netdata/pull/8476)) by [illumine](https://github.com/illumine) ### CI/CD - Disabled `document-start` yamllint check. ([\#8522](https://github.com/netdata/netdata/pull/8522)) by [ilyam8](https://github.com/ilyam8) - Simplified Docker build/publish scripts to support only a single architecture. ([\#8747](https://github.com/netdata/netdata/pull/8747)) by [knatsakis](https://github.com/knatsakis) - Added Fedora 32 to build checks. 
([\#8417](https://github.com/netdata/netdata/pull/8417)) by [Ferroin](https://github.com/Ferroin) - Added libffi to ArchLinux CI tests as a workaround for an upstream bug. ([\#8476](https://github.com/netdata/netdata/pull/8476)) by [Ferroin](https://github.com/Ferroin) ### Other - Updated main copyright and links for the year 2020 in daemon help output. ([\#8937](https://github.com/netdata/netdata/pull/8937)) by [zack-shoylev](https://github.com/zack-shoylev) - Moved `bind to` to `[web]` section and updated `netdata.service.v235.in` to sync it with recent changes. ([\#8454](https://github.com/netdata/netdata/pull/8454)) by [amishmm](https://github.com/amishmm) - Put old dashboard behind a prefix instead of using a script to switch. ([\#8754](https://github.com/netdata/netdata/pull/8754)) by [Ferroin](https://github.com/Ferroin) - Enabled the truthy rule in yamllint. ([\#8698](https://github.com/netdata/netdata/pull/8698)) by [ilyam8](https://github.com/ilyam8) - Added Borg backup, Squeezebox servers, Hiawatha web server, and Microsoft SQL to apps.plugin so that it can appropriately group them by type of service. ([\#8646](https://github.com/netdata/netdata/pull/8646)), ([\#8655](https://github.com/netdata/netdata/pull/8655)), ([\#8656](https://github.com/netdata/netdata/pull/8656)), and ([\#8659](https://github.com/netdata/netdata/pull/8659)) by [vlvkobal](https://github.com/thenktor) ## Documentation - Added a custom label to collectors frontmatter to fix sidebar titles in generated docs site at `learn.netdata.cloud`. ([\#8936](https://github.com/netdata/netdata/pull/8936)) by [joelhans](https://github.com/joelhans) - Added instructions to persist metrics and restart policy in Docker installations. ([\#8813](https://github.com/netdata/netdata/pull/8813)) by [joelhans](https://github.com/joelhans) - Fixed modifier in Nginx guide to ensure correct paths and filenames. 
([\#8880](https://github.com/netdata/netdata/pull/8880)) by [slavaGanzin](https://github.com/slavaGanzin) - Added documentation for working around Clang build errors. ([\#8867](https://github.com/netdata/netdata/pull/8867)) by [Ferroin](https://github.com/Ferroin) - Fixed typo in Docker installation instructions. ([\#8861](https://github.com/netdata/netdata/pull/8861)) by [carehart](https://github.com/carehart) - Added Docker instructions to claiming docs. ([\#8755](https://github.com/netdata/netdata/pull/8755)) by [joelhans](https://github.com/joelhans) - Capitalized title in streaming doc. ([\#8712](https://github.com/netdata/netdata/pull/8712)) by [zack-shoylev](https://github.com/zack-shoylev) - Updated pfSense doc and added warning for apcupsd users. ([\#8686](https://github.com/netdata/netdata/pull/8686)) by [cryptoluks](https://github.com/cryptoluks) - Improved offline installation instructions to point to correct installation scripts and clarify process. ([\#8680](https://github.com/netdata/netdata/pull/8680)) by [IceCodeNew](https://github.com/IceCodeNew) - Added missing path to the process of editing `charts.d.conf`. ([\#8740](https://github.com/netdata/netdata/pull/8740)) by [Jiab77](https://github.com/Jiab77) - Added combined claiming and ACLK documentation. ([\#8724](https://github.com/netdata/netdata/pull/8724)) by [joelhans](https://github.com/joelhans) - Standardized how we link between various Agent-specific documentation. ([\#8638](https://github.com/netdata/netdata/pull/8638)) by [joelhans](https://github.com/joelhans) - Pinned `mkdocs-material` to re-enable Netlify builds of documentation site. ([\#8639](https://github.com/netdata/netdata/pull/8639)) by [joelhans](https://github.com/joelhans) - Updated main `README.md` with v1.21 release news. ([\#8619](https://github.com/netdata/netdata/pull/8619)) by [joelhans](https://github.com/joelhans) - Changed references of **MacOS** to **macOS**. 
([\#8562](https://github.com/netdata/netdata/pull/8562)) by [joelhans](https://github.com/joelhans) ## Bug fixes - Fixed kickstart error by removing old `cron` symlink. ([\#8849](https://github.com/netdata/netdata/pull/8849)) by [prologic](https://github.com/prologic) - Fixed bundling of old dashboard in binary packages. ([\#8844](https://github.com/netdata/netdata/pull/8844)) by [Ferroin](https://github.com/Ferroin) - Fixed typo in `netdata-installer.sh`. ([\#8811](https://github.com/netdata/netdata/pull/8811)) by [adamwolf](https://github.com/adamwolf) - Fixed failure output during installations by removing old function call. ([\#8824](https://github.com/netdata/netdata/pull/8824)) by [Ferroin](https://github.com/Ferroin) - Fixed `bundle-dashboard.sh` script to prevent broken package builds. ([\#8823](https://github.com/netdata/netdata/pull/8823)) by [prologic](https://github.com/prologic) - Fixed mdstat `failed devices` alarm. ([\#8752](https://github.com/netdata/netdata/pull/8752)) by [ilyam8](https://github.com/ilyam8) - Fixed rare race condition in old Cloud iframe. ([\#8786](https://github.com/netdata/netdata/pull/8786)) by [jacekkolasa](https://github.com/jacekkolasa) - Removed `no-clear-notification` options from portcheck health templates. ([\#8748](https://github.com/netdata/netdata/pull/8748)) by [ilyam8](https://github.com/ilyam8) - Fixed issue in `system-info.sh` regarding the parsing of `lscpu` output. ([\#8754](https://github.com/netdata/netdata/pull/8754)) by [Ferroin](https://github.com/Ferroin) - Fixed old URLs to silence Netlify's mixed content warnings. ([\#8759](https://github.com/netdata/netdata/pull/8759)) by [knatsakis](https://github.com/knatsakis) - Fixed master streaming fatal exits. ([\#8780](https://github.com/netdata/netdata/pull/8780)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed email authentication to Cloud/Nodes View. 
([\#8757](https://github.com/netdata/netdata/pull/8757)) by [jacekkolasa](https://github.com/jacekkolasa) - Fixed non-escaped characters in private registry URLs. ([\#8757](https://github.com/netdata/netdata/pull/8757)) by [jacekkolasa](https://github.com/jacekkolasa) - Fixed crash when shutting down an Agent with the ACLK disabled. ([\#8725](https://github.com/netdata/netdata/pull/8725)) by [lassebm](https://github.com/lassebm) - Fixed Docker-based builder image. ([\#8718](https://github.com/netdata/netdata/pull/8718)) by [ilyam8](https://github.com/ilyam8) - Fixed status checks for UPS devices using the apcupsd collector. ([\#8688](https://github.com/netdata/netdata/pull/8688)) by [ilyam8](https://github.com/ilyam8) - Fixed the build matrix in the build and install GitHub Actions checks. ([\#8715](https://github.com/netdata/netdata/pull/8715)) by [Ferroin](https://github.com/Ferroin) - Fixed eBPF collector compatibility with the 7.x family of RedHat. ([\#8694](https://github.com/netdata/netdata/pull/8694)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed alarm notification script by adding a check to the Dynatrace notification method. ([\#8654](https://github.com/netdata/netdata/pull/8654)) by [ilyam8](https://github.com/ilyam8) - Fixed `threads_creation_rate` chart context in the python.d MySQL collector. ([\#8636](https://github.com/netdata/netdata/pull/8636)) by [ilyam8](https://github.com/ilyam8) - Fixed errors shown when running `install-required-packages.sh` on certain Linux systems. ([\#8606](https://github.com/netdata/netdata/pull/8606)) by [ilyam8](https://github.com/ilyam8) - Fixed `sudo` check in charts.d libreswan collector to prevent daily security notices. 
([\#8569](https://github.com/netdata/netdata/pull/8569)) by [ilyam8](https://github.com/ilyam8) 2020-05-11T13:14:37+00:00 netdata v1.22.1 netdata v1.22.1 2020-05-12T19:34:48+00:00 # Netdata v1.22.1 Release v1.22.1 is a hotfix release to address issues related to packaging and how Agents connect to Netdata Cloud. With packaging, we fixed an error that caused DEB and RPM packages to only display the old dashboard and not the new React version. We also fixed an issue that caused Netdata Docker containers to fail due to incorrect permissions. Finally, we ensured JSON-C is correctly fetched and built for compatibility with Netdata Cloud. We appreciate our community's help in identifying and diagnosing these issues so we could fix them quickly. For Netdata Cloud, we optimized the on-connect payload sent through the Agent-Cloud link to improve latency between Agents and Cloud. We also removed a check for old alarm status when sending alarms to Cloud via the ACLK. Finally, we made a fix that ensures Agents running on systems using the musl C library can receive auto-updates. ## Bug fixes - Fixed the latency issue on the ACLK and suppressed the diagnostics. ([\#8992](https://github.com/netdata/netdata/pull/8992)) by [amoss](https://github.com/amoss) and [stelfrag](https://github.com/stelfrag) - Restored old semantics of "netdata -W set" command. ([\#8987](https://github.com/netdata/netdata/pull/8987)) by [mfundul](https://github.com/mfundul) - Added JSON-C packaging files to make dist. ([\#8986](https://github.com/netdata/netdata/pull/8986)) by [Ferroin](https://github.com/Ferroin) - Fixed bundling of React dashboard in DEB and RPM packages. ([\#8988](https://github.com/netdata/netdata/pull/8988)) by [Ferroin](https://github.com/Ferroin) - Removed check for old alarm status. ([\#8978](https://github.com/netdata/netdata/pull/8978)) by [stelfrag](https://github.com/stelfrag) - Fixed shutdown via netdatacli with musl C library. 
([\#8931](https://github.com/netdata/netdata/pull/8931)) by [mfundul](https://github.com/mfundul) 2020-05-12T19:34:48+00:00 netdata v1.23.0 netdata v1.23.0 2020-06-25T02:53:38+00:00 # Release v1.23.0 The v1.23.0 release of the Netdata Agent is all about unlocking new depths of visibility for your applications, services, and systems. We have Kubernetes service discovery, new eBPF metrics like virtual filesystem switch and bandwidth per process out of the Linux kernel at _event frequency_, more interoperability with your monitoring stack thanks to a new exporting engine, and much more. This release contains 2 new collectors, 1 new exporting connector, 1 new alarm notification method, 55 improvements, 45 documentation updates, and 40 bug fixes. ## At a glance Our [service discovery collector](https://github.com/netdata/agent-service-discovery/) **detects Kubernetes (k8s) pods and immediately collects metrics from _22 different services_** as the associated pods are created, destroyed, and scaled. Service discovery is installed when you use our [Helm chart](https://github.com/netdata/helmchart), which means you can now collect and visualize service-, pod-, Kubelet-, kube-proxy-, and node-level k8s metrics with one `helm install` command and zero configuration. All our Kubernetes monitoring components are open source and free for clusters of any size. Our low-level [Linux kernel monitoring via eBPF](https://learn.netdata.cloud/docs/agent/collectors/ebpf.plugin/) is now supercharged. Thanks to an integration with [`apps.plugin`](https://learn.netdata.cloud/docs/agent/collectors/apps.plugin), you can now **monitor how a specific application interacts with the Linux kernel**. This update also includes new metrics, such as virtual filesystem switch, bandwidth per process, and much more. Netdata collects these metrics at an event frequency, even better than our famous 1s granularity, so that you can debug applications or anomalies with pinpoint accuracy. 
The eBPF collector is also now installed and enabled by default except on [static builds](https://learn.netdata.cloud/docs/agent/packaging/installer/methods/kickstart-64). Read our [guide on troubleshooting apps with eBPF metrics](https://learn.netdata.cloud/guides/troubleshoot/monitor-debug-applications-ebpf/) for more details. Netdata is now more interoperable with your existing monitoring stack thanks to the [**exporting engine**](https://learn.netdata.cloud/docs/agent/exporting/), which replaces the backends system. You can now export to multiple external databases through Graphite, Google Cloud Pub/Sub, Prometheus remote write, MongoDB, and JSON connectors, plus others. Send metrics as soon as they're collected to enrich single pane of glass views or analyze Netdata's metrics with machine learning. Read our guide on [exporting metrics to Graphite](https://learn.netdata.cloud/guides/export/export-netdata-metrics-graphite) for specifics on just one of many pipelines you can set up to archive your Netdata metrics. We're also releasing an improvement for the availability of your monitoring and metrics: **persistent metadata**. The Agent now writes metadata to disk alongside metrics to allow access to non-active charts from Netdata Cloud and enable future features. We added some enhancements to our documentation site, including a new [guides section](https://learn.netdata.cloud/guides). We'll continue to populate it with more use case- and scenario-based content to help you monitor, troubleshoot, visualize, and export your Netdata metrics. ## Acknowledgments - [okias](https://github.com/okias) for adding support for Matrix notifications. - [elelayan](https://github.com/elelayan) for adding an OSD size collection chart to the Ceph collector. - [vsc55](https://github.com/vsc55) for fixing the required packages for Gentoo builds. - [rushikeshjadhav](https://github.com/rushikeshjadhav) for fixing the Xenstat collector to correctly track the last number of vCPUs. 
- [Saruspete](https://github.com/Saruspete) for removing conflicting EPEL packages. - [MrFreezeex](https://github.com/MrFreezeex) for fixing suid bits in Debian packaging. - [Neamar](https://github.com/Neamar) for fixing a typo in the dashboard's description of the `mem.kernel` chart. - [jeffgdotorg](https://github.com/jeffgdotorg) for fixing incorrectly formatted TYPE lines in the Prometheus backend/exporter. - [tnyeanderson](https://github.com/tnyeanderson) for continuing to improve his `dash.html` custom dashboard. - [dpsy4](https://github.com/dpsy4) for fixing our Swagger API file. - [araemo](https://github.com/araemo) for fixing alarms around RAM usage in ZFS systems. - [slavaGanzin](https://github.com/slavaGanzin) for implementing a fix to the PostgreSQL collector. - [pkrasam](https://github.com/pkrasam), [thoggs](https://github.com/thoggs), [oneoneonepig](https://github.com/oneoneonepig), [Steve8291](https://github.com/Steve8291), [stephenrauch](https://github.com/stephenrauch), [waybeforenow](https://github.com/waybeforenow), [zvarnes](https://github.com/zvarnes), [electropup42](https://github.com/electropup42), [cherouvim](https://github.com/cherouvim), [thenktor](https://github.com/thenktor), [webash](https://github.com/webash) and [gruentee](https://github.com/gruentee) for contributing documentation changes. ## Improvements - Added libuv thread names support to FATAL log level. ([\#9382](https://github.com/netdata/netdata/pull/9382)) by [mfundul](https://github.com/mfundul) - Updated the React dashboard to v1.0.14_2. ([\#9350](https://github.com/netdata/netdata/pull/9350)) by [jacekkolasa](https://github.com/jacekkolasa) - Improved PR guidelines for developers and contributors. ([\#8809](https://github.com/netdata/netdata/pull/8809)) by [prologic](https://github.com/prologic) - Removed master-slave verbiage and replaced it with parent-child. 
([\#9323](https://github.com/netdata/netdata/pull/9323)) by [amoss](https://github.com/amoss), ([\#9312](https://github.com/netdata/netdata/pull/9312)) by [joelhans](https://github.com/joelhans) - Added support for persistent metadata. ([\#9324](https://github.com/netdata/netdata/pull/9324)) by [stelfrag](https://github.com/stelfrag) - Added verbose prints when spawn server fails to spawn. ([\#9305](https://github.com/netdata/netdata/pull/9305)) by [mfundul](https://github.com/mfundul) - Updated the streaming protocol to calculate clock slew and gap size when child nodes reconnect to a parent. ([\#9214](https://github.com/netdata/netdata/pull/9214)) by [amoss](https://github.com/amoss) - Implemented a new incremental parser for internal plugins and child nodes. ([\#9074](https://github.com/netdata/netdata/pull/9074)) by [stelfrag](https://github.com/stelfrag) - Improved database engine by reducing its minimum size to 64 MiB. ([\#9094](https://github.com/netdata/netdata/pull/9094)) by [mfundul](https://github.com/mfundul) - Added alphabetical sort and automatic scroll to `dash.html`. ([\#8762](https://github.com/netdata/netdata/pull/8762)) by [tnyeanderson](https://github.com/tnyeanderson) - Added a spawn server to improve Agent scalability by reducing the impact of alarm execution and notification to critical sections in the main health thread. ([\#8407](https://github.com/netdata/netdata/pull/8407)) by [mfundul](https://github.com/mfundul) ### Netdata Cloud - Added metrics for ACLK performance and status to the **Netdata Monitoring** section of the dashboard. ([\#9269](https://github.com/netdata/netdata/pull/9269)) by [underhood](https://github.com/underhood) - Improved the node re-claiming process by regenerating the topic base. ([\#9044](https://github.com/netdata/netdata/pull/9044)) by [amoss](https://github.com/amoss) ### Collectors - Updated the Go orchestrator to v0.19.2. 
([\#9340](https://github.com/netdata/netdata/pull/9340)) by [ilyam8](https://github.com/ilyam8) - Added the `agent-service-discovery` collector plugin to `apps_groups.conf`. ([\#9315](https://github.com/netdata/netdata/pull/9315)) by [ilyam8](https://github.com/ilyam8) - Improved consistency of Kubernetes cgroup names. ([\#9303](https://github.com/netdata/netdata/pull/9303)) by [cakrit](https://github.com/cakrit) - Updated the Go orchestrator to v0.19.1. ([\#9309](https://github.com/netdata/netdata/pull/9309)) by [ilyam8](https://github.com/ilyam8) - Added imunify and lsphp to `apps_groups.conf`. ([\#9284](https://github.com/netdata/netdata/pull/9284)) by [thiagoftsm](https://github.com/thiagoftsm) - Updated the Go orchestrator to v0.19.0. ([\#9294](https://github.com/netdata/netdata/pull/9294)) by [ilyam8](https://github.com/ilyam8) - Added support for the eBPF collector in static installations (`kickstart-static64.sh`). ([\#8879](https://github.com/netdata/netdata/pull/8879)) by [prologic](https://github.com/prologic) - Updated the eBPF kernel-collector to v0.4.0. See [the changelog](https://github.com/netdata/kernel-collector/releases/tag/v0.4.0) for details. ([\#9212](https://github.com/netdata/netdata/pull/9212)) by [Ferroin](https://github.com/Ferroin) - Added integration between `ebpf.plugin` and `apps.plugin`. ([\#9178](https://github.com/netdata/netdata/pull/9178)) by [thiagoftsm](https://github.com/thiagoftsm) - Converted the eBPF collector into a modular design to allow multiple eBPF programs to run in parallel. ([\#9148](https://github.com/netdata/netdata/pull/9148)) by [thiagoftsm](https://github.com/thiagoftsm) - Added an OSD size collection chart to the Ceph collector. ([\#8649](https://github.com/netdata/netdata/pull/8649)) by [elelayan](https://github.com/elelayan) - Updated the eBPF kernel-collector to v0.2.0. See [the changelog](https://github.com/netdata/kernel-collector/releases/tag/v0.2.0) for details.
([\#9118](https://github.com/netdata/netdata/pull/9118)) by [prologic](https://github.com/prologic) - Improved `system-info.sh` to better handle certain cases when gathering info on the system's disk capacity. ([\#7902](https://github.com/netdata/netdata/pull/7902)) by [Ferroin](https://github.com/Ferroin) - Changed the eBPF collector to install and enable it by default. ([\#8665](https://github.com/netdata/netdata/pull/8665)) by [Ferroin](https://github.com/Ferroin) - Enhanced the Samba collector to only use `sudo` when not running as the root user. ([\#9038](https://github.com/netdata/netdata/pull/9038)) by [Duffyx](https://github.com/Duffyx) - Renamed the eBPF collector from `ebpf_process.plugin` to `ebpf.plugin`. ([\#8822](https://github.com/netdata/netdata/pull/8822)) by [thiagoftsm](https://github.com/thiagoftsm) - Added more command line options to the eBPF collector to support upcoming features. ([\#8879](https://github.com/netdata/netdata/pull/8879)) by [thiagoftsm](https://github.com/thiagoftsm) - Added compatibility for Varnish Cache Plus in the `varnish` collector. ([\#8940](https://github.com/netdata/netdata/pull/8940)) by [pgjavier](https://github.com/pgjavier) ### Packaging/installation - Added new streaming files into CMake build. ([\#9316](https://github.com/netdata/netdata/pull/9316)) by [underhood](https://github.com/underhood) - Added support for macOS/Homebrew in `install-required-packages.sh`. ([\#8286](https://github.com/netdata/netdata/pull/8286)) by [Ferroin](https://github.com/Ferroin) - Improved reliability of checksums for `kickstart.sh`/`kickstart-static64.sh` installation scripts. ([\#9165](https://github.com/netdata/netdata/pull/9165)) by [prologic](https://github.com/prologic) - Added required bundle for libuuid on ClearLinux. ([\#9060](https://github.com/netdata/netdata/pull/9060)) by [Ferroin](https://github.com/Ferroin) - Removed conflicting EPEL packages. 
([\#9108](https://github.com/netdata/netdata/pull/9108)) by [Saruspete](https://github.com/Saruspete) ### Exporting - Moved `nc` backend to exporting. ([\#9030](https://github.com/netdata/netdata/pull/9030)) by [thiagoftsm](https://github.com/thiagoftsm) - Added missing checks to exporting engine. ([\#9034](https://github.com/netdata/netdata/pull/9034)) by [thiagoftsm](https://github.com/thiagoftsm) - Added new alarms for exporting engine resource usage and deprecation of backends. ([\#9075](https://github.com/netdata/netdata/pull/9075)) by [thiagoftsm](https://github.com/thiagoftsm) - Added an error report to the AWS Kinesis connector. ([\#9048](https://github.com/netdata/netdata/pull/9048)) by [thiagoftsm](https://github.com/thiagoftsm) - Added memory cleanup to remaining exporting connectors. ([\#9098](https://github.com/netdata/netdata/pull/9098)) by [thiagoftsm](https://github.com/thiagoftsm) - Added a warning if the exporting engine's update interval is not a multiple of the database's update interval. ([\#9131](https://github.com/netdata/netdata/pull/9131)) by [vlvkobal](https://github.com/vlvkobal) - Added anonymous statistics to exporting engine to collect usage data. ([\#9125](https://github.com/netdata/netdata/pull/9125)) by [vlvkobal](https://github.com/vlvkobal) - Improved dynamic memory cleanup for Pub/Sub exporting connector. ([\#9112](https://github.com/netdata/netdata/pull/9112)) by [vlvkobal](https://github.com/vlvkobal) - Improved dynamic memory cleanup for the MongoDB exporting connector. ([\#9103](https://github.com/netdata/netdata/pull/9103)) by [vlvkobal](https://github.com/vlvkobal) - Finalized the main cleanup function for the exporting engine. ([\#9099](https://github.com/netdata/netdata/pull/9099)) by [vlvkobal](https://github.com/vlvkobal) - Added a function to help clean up memory on exit. 
([\#9081](https://github.com/netdata/netdata/pull/9081)) by [vlvkobal](https://github.com/vlvkobal) - Added a Google Cloud Pub/Sub connector to the exporting engine. ([\#8855](https://github.com/netdata/netdata/pull/8855)) by [vlvkobal](https://github.com/vlvkobal) ### Notifications - Added support for Matrix notifications. ([\#9196](https://github.com/netdata/netdata/pull/9196)) by [okias](https://github.com/okias) ### CI/CD - Removed Gentoo from CI checks. ([\#9327](https://github.com/netdata/netdata/pull/9327)) by [prologic](https://github.com/prologic) - Added a random offset to the update script when running non-interactively. ([\#9245](https://github.com/netdata/netdata/pull/9245)) by [Ferroin](https://github.com/Ferroin) - Added a CI check for building against LibreSSL. ([\#9216](https://github.com/netdata/netdata/pull/9216)) by [prologic](https://github.com/prologic) - Added a health check functionality to Docker images. ([\#9172](https://github.com/netdata/netdata/pull/9172)) by [Ferroin](https://github.com/Ferroin) - Added CI for static builds of the Netdata Agent (used by `kickstart-static64.sh`). ([\#9130](https://github.com/netdata/netdata/pull/9130)) by [prologic](https://github.com/prologic) - Removed deprecated documentation Dockerfile and associated Docker Hub image. ([\#9126](https://github.com/netdata/netdata/pull/9126)) by [prologic](https://github.com/prologic) - Removed deprecated documentation tooling. ([\#8783](https://github.com/netdata/netdata/pull/8783)) by [prologic](https://github.com/prologic) - Added a CI job to check Markdown links during PRs. ([\#9003](https://github.com/netdata/netdata/pull/9003)) by [joelhans](https://github.com/joelhans) - Removed Polyverse Polymorphic Linux from Docker builds to reduce the image size. ([\#8802](https://github.com/netdata/netdata/pull/8802)) by [Ferroin](https://github.com/Ferroin) ## Documentation - Fixed a typo in the Synology installation documentation. 
([\#9400](https://github.com/netdata/netdata/pull/9400)) by [pkrasam](https://github.com/pkrasam) - Added a guide for troubleshooting with eBPF metrics. ([\#9352](https://github.com/netdata/netdata/pull/9352)) by [joelhans](https://github.com/joelhans) - Improved the FreeBSD installation documentation. ([\#9116](https://github.com/netdata/netdata/pull/9116)) by [thoggs](https://github.com/thoggs) - Added a missing slash to the claiming documentation. ([\#9257](https://github.com/netdata/netdata/pull/9257)) by [oneoneonepig](https://github.com/oneoneonepig) - Changed the recommended repository for CentOS 8 users. ([\#9308](https://github.com/netdata/netdata/pull/9308)) by [Ferroin](https://github.com/Ferroin) - Added a guide for exporting metrics to Graphite. ([\#9285](https://github.com/netdata/netdata/pull/9285)) by [joelhans](https://github.com/joelhans) - Added a link in the eBPF documentation to the kernel documentation for ftrace. ([\#9211](https://github.com/netdata/netdata/pull/9211)) by [Steve8291](https://github.com/Steve8291) - Fixed curly to straight apostrophe. ([\#8723](https://github.com/netdata/netdata/pull/8723)) by [zack-shoylev](https://github.com/zack-shoylev) - Added documentation and dashboard information for new eBPF-apps.plugin integration. ([\#9199](https://github.com/netdata/netdata/pull/9199)) by [thiagoftsm](https://github.com/thiagoftsm) - Moved and refactored docs to accommodate new Guides section on Learn. ([\#9266](https://github.com/netdata/netdata/pull/9266)) by [joelhans](https://github.com/joelhans) - Removed outdated information/links from main README and registry doc. ([\#9265](https://github.com/netdata/netdata/pull/9265)) by [joelhans](https://github.com/joelhans) - Added notes/known issues section to installation page. ([\#9053](https://github.com/netdata/netdata/pull/9053)) by [joelhans](https://github.com/joelhans) - Fixed ambiguity in health reference for `of` and `foreach` options in lookup line.
([\#9255](https://github.com/netdata/netdata/pull/9255)) by [underhood](https://github.com/underhood) - Added a new "home base" document for the exporting engine. ([\#9246](https://github.com/netdata/netdata/pull/9246)) by [joelhans](https://github.com/joelhans) - Improved database engine documentation for streaming setups. ([\#9177](https://github.com/netdata/netdata/pull/9177)) by [joelhans](https://github.com/joelhans) - Fixed typo in eBPF collector `README.md`. ([\#9205](https://github.com/netdata/netdata/pull/9205)) by [Steve8291](https://github.com/Steve8291) - Fixed typo in `README.md`. ([\#9151](https://github.com/netdata/netdata/pull/9151)) by [stephenrauch](https://github.com/stephenrauch) - Removed the "experimental" label from the exporting engine documentation. ([\#9171](https://github.com/netdata/netdata/pull/9171)) by [vlvkobal](https://github.com/vlvkobal) - Fixed typo in step 3 of step-by-step guide. ([\#9150](https://github.com/netdata/netdata/pull/9150)) by [waybeforenow](https://github.com/waybeforenow) - Added a Certbot troubleshooting section to step 10 of the step-by-step guide. ([\#9000](https://github.com/netdata/netdata/pull/9000)) by [Jelmerrevers](https://github.com/Jelmerrevers) - Updated eBPF documentation to reflect default enabled status. ([\#9105](https://github.com/netdata/netdata/pull/9105)) by [joelhans](https://github.com/joelhans) - Added ACLK connection details. ([\#9047](https://github.com/netdata/netdata/pull/9047)) by [zack-shoylev](https://github.com/zack-shoylev) - Added CMake to the list of packages to install on FreeBSD installations. ([\#9031](https://github.com/netdata/netdata/pull/9031)) by [zvarnes](https://github.com/zvarnes) - Improved Synology installation document with better formatting and instructions. ([\#8658](https://github.com/netdata/netdata/pull/8658)) by [thenktor](https://github.com/thenktor) - Updated pfSense installation document with new packages and processes. 
([\#8544](https://github.com/netdata/netdata/pull/8544)) by [electropup42](https://github.com/electropup42) - Updated documentation contributing guidelines and Netdata style guide. ([\#8781](https://github.com/netdata/netdata/pull/8781)) by [joelhans](https://github.com/joelhans) - Added links to promote database engine calculator. ([\#9067](https://github.com/netdata/netdata/pull/9067)) by [joelhans](https://github.com/joelhans) - Updated exporting engine documentation to prepare for enabling it by default. ([\#9066](https://github.com/netdata/netdata/pull/9066)) by [vlvkobal](https://github.com/vlvkobal) - Added requirements to the ProxySQL collector documentation. ([\#9071](https://github.com/netdata/netdata/pull/9071)) by [ilyam8](https://github.com/ilyam8) - Added proc.plugin configuration example for high-processor systems. ([\#9062](https://github.com/netdata/netdata/pull/9062)) by [joelhans](https://github.com/joelhans) - Added frontmatter for exporting connectors. ([\#9052](https://github.com/netdata/netdata/pull/9052)) by [joelhans](https://github.com/joelhans) - Fixed grammar error in HAProxy documentation. ([\#8703](https://github.com/netdata/netdata/pull/8703)) by [cherouvim](https://github.com/cherouvim) - Updated FreeBSD package installation documentation. ([\#8643](https://github.com/netdata/netdata/pull/8643)) by [thenktor](https://github.com/thenktor) - Fixed `docker run` instruction in claiming document. ([\#9058](https://github.com/netdata/netdata/pull/9058)) by [ilyam8](https://github.com/ilyam8) - Added a note about restarting a node during reclaiming. ([\#9049](https://github.com/netdata/netdata/pull/9049)) by [zack-shoylev](https://github.com/zack-shoylev) - Removed mentions of old Cloud and replaced them with new Cloud/dashboard. ([\#8874](https://github.com/netdata/netdata/pull/8874)) by [joelhans](https://github.com/joelhans) - Fixed broken link in web server log guide on GitHub. 
([\#9033](https://github.com/netdata/netdata/pull/9033)) by [joelhans](https://github.com/joelhans) - Removed emoji from step-by-step guide. ([\#8872](https://github.com/netdata/netdata/pull/8872)) by [MeganBishopMoore](https://github.com/MeganBishopMoore) - Added text to claiming documentation about reclaiming. ([\#9027](https://github.com/netdata/netdata/pull/9027)) by [joelhans](https://github.com/joelhans) - Updated daemon output with new URLs and dates. ([\#8965](https://github.com/netdata/netdata/pull/8965)) by [joelhans](https://github.com/joelhans) - Added `netdatalib` and `netdatacache` volumes to the Docker-with-Caddy documentation. ([\#8999](https://github.com/netdata/netdata/pull/8999)) by [webash](https://github.com/webash) - Fixed an incorrect file name in the Go-based web log collector. ([\#8964](https://github.com/netdata/netdata/pull/8964)) by [gruentee](https://github.com/gruentee) - Removed incorrect `UNUSED` from flood protection configuration options documentation. ([\#8964](https://github.com/netdata/netdata/pull/8964)) by [mfundul](https://github.com/mfundul) - Fixed internal links and removed obsolete admonitions. ([\#8946](https://github.com/netdata/netdata/pull/8946)) by [joelhans](https://github.com/joelhans) - Updated docs with go-live claiming and ACLK information. ([\#8960](https://github.com/netdata/netdata/pull/8960)) by [joelhans](https://github.com/joelhans) ## Bug fixes - Fixed a Coverity defect. ([\#9402](https://github.com/netdata/netdata/pull/9402)) by [amoss](https://github.com/amoss) - Fixed a bug in the simple exporting connector that caused crashes when both `opentsdb:https` and another connector were enabled together. ([\#9389](https://github.com/netdata/netdata/pull/9389)) by [vlvkobal](https://github.com/vlvkobal) - Fixed missing host variables on stream. ([\#9396](https://github.com/netdata/netdata/pull/9396)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed race-hazard in streaming during the shutdown sequence.
([\#9370](https://github.com/netdata/netdata/pull/9370)) by [amoss](https://github.com/amoss) - Fixed error handling and recovery during compaction and metadata log replay. ([\#9354](https://github.com/netdata/netdata/pull/9354)) by [stelfrag](https://github.com/stelfrag) - Fixed ACLK shutdown sequence. ([\#9367](https://github.com/netdata/netdata/pull/9367)) by [underhood](https://github.com/underhood) - Fixed logging by replacing `assert()` calls with new `fatal_assert()`. ([\#9349](https://github.com/netdata/netdata/pull/9349)) by [mfundul](https://github.com/mfundul) - Fixed issues with CentOS 6 installations by getting Netdata execution path early to avoid user permission issues. ([\#9339](https://github.com/netdata/netdata/pull/9339)) by [mfundul](https://github.com/mfundul) - Fixed issues with ebpf.plugin and apps.plugin integration. ([\#9333](https://github.com/netdata/netdata/pull/9333)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed Coverity warnings in database. ([\#9338](https://github.com/netdata/netdata/pull/9338)) by [mfundul](https://github.com/mfundul) - Fixed compiler warnings from the database when the Agent is compiled with the `--disable-cloud` flag. ([\#9337](https://github.com/netdata/netdata/pull/9337)) by [stelfrag](https://github.com/stelfrag) - Fixed invalid memory access in databases to avoid Coverity errors. ([\#9326](https://github.com/netdata/netdata/pull/9326)) by [stelfrag](https://github.com/stelfrag) - Fixed broken updates caused by enabling the eBPF collector by default by adding a dummy `--enable-ebpf` flag. ([\#9310](https://github.com/netdata/netdata/pull/9310)) by [Ferroin](https://github.com/Ferroin) - Fixed exporting to Cortex by adding an additional HTTP header to the Prometheus remote write connector. ([\#9302](https://github.com/netdata/netdata/pull/9302)) by [vlvkobal](https://github.com/vlvkobal) - Fixed a race hazard causing crashes in streaming configurations.
([\#9297](https://github.com/netdata/netdata/pull/9297)) by [amoss](https://github.com/amoss) - Fixed handling of OpenSSL on CentOS/RHEL by bundling a static copy and selecting a configuration directory at install time. ([\#9263](https://github.com/netdata/netdata/pull/9263)) by [Ferroin](https://github.com/Ferroin) - Fixed static installation from overwriting `netdata.conf`. ([\#9174](https://github.com/netdata/netdata/pull/9174)) by [Ferroin](https://github.com/Ferroin) - Fixed compilation on older systems (Ubuntu 14.04 LTS, Debian 8, CentOS 6). ([\#9198](https://github.com/netdata/netdata/pull/9198)) by [ktsaou](https://github.com/ktsaou) - Fixed broken unit tests for the exporting engine. ([\#9183](https://github.com/netdata/netdata/pull/9183)) by [vlvkobal](https://github.com/vlvkobal) - Fixed an issue with the exporting engine not cleaning a string on exit. ([\#9188](https://github.com/netdata/netdata/pull/9188)) by [vlvkobal](https://github.com/vlvkobal) - Fixed issue with incremental parser breaking CMake builds. ([\#9186](https://github.com/netdata/netdata/pull/9186)) by [stelfrag](https://github.com/stelfrag) - Fixed the eBPF collector failing to install on certain systems. ([\#9182](https://github.com/netdata/netdata/pull/9182)) by [prologic](https://github.com/prologic) - Fixed Coverity warning. ([\#9180](https://github.com/netdata/netdata/pull/9180)) by [thiagoftsm](https://github.com/thiagoftsm) - Fixed required packages for Gentoo builds. ([\#9141](https://github.com/netdata/netdata/pull/9141)) by [vsc55](https://github.com/vsc55) - Fixed Coverity warning. ([\#9157](https://github.com/netdata/netdata/pull/9157)) by [stelfrag](https://github.com/stelfrag) - Fixed broken collector plugins due to bug in parser. ([\#9158](https://github.com/netdata/netdata/pull/9158)) by [stelfrag](https://github.com/stelfrag) - Fixed the Xenstat collector to correctly track the last number of vCPUs. 
([\#8720](https://github.com/netdata/netdata/pull/8720)) by [rushikeshjadhav](https://github.com/rushikeshjadhav) - Fixed incorrect link in `install-required-packages.sh` to help users submit a GitHub issue. ([\#8911](https://github.com/netdata/netdata/pull/8911)) by [prologic](https://github.com/prologic) - Fixed enable/start of `netdata` service in Debian package. ([\#9005](https://github.com/netdata/netdata/pull/9005)) by [MrFreezeex](https://github.com/MrFreezeex) - Fixed buffer splitting in the Kinesis exporting connector. ([\#9122](https://github.com/netdata/netdata/pull/9122)) by [vlvkobal](https://github.com/vlvkobal) - Fixed suid bits on plugin for Debian packaging. ([\#8996](https://github.com/netdata/netdata/pull/8996)) by [MrFreezeex](https://github.com/MrFreezeex) - Fixed zombie processes in Docker image by restoring `SIGCHLD` signal handler. ([\#9107](https://github.com/netdata/netdata/pull/9107)) by [mfundul](https://github.com/mfundul) - Fixed static installation to not overwrite `netdata.conf` when updating. ([\#9046](https://github.com/netdata/netdata/pull/9046)) by [Ferroin](https://github.com/Ferroin) - Fixed typo in the dashboard's description of the `mem.kernel` chart. ([\#9096](https://github.com/netdata/netdata/pull/9096)) by [Neamar](https://github.com/Neamar) - Fixed incorrectly formatted TYPE lines in the Prometheus backend/exporter. ([\#9086](https://github.com/netdata/netdata/pull/9086)) by [jeffgdotorg](https://github.com/jeffgdotorg) - Fixed error handling in the exporting connector. ([\#8910](https://github.com/netdata/netdata/pull/8910)) by [vlvkobal](https://github.com/vlvkobal) - Added a missing bracket to the Netdata API swagger `.json` file. ([\#8814](https://github.com/netdata/netdata/pull/8814)) by [dpsy4](https://github.com/dpsy4) - Fixed the health entity calculation used for `ram_in_use` and `used_ram_to_ignore` in systems using ZFS.
([\#8913](https://github.com/netdata/netdata/pull/8913)) by [araemo](https://github.com/araemo) - Fixed incorrect hostnames in the exporting engine. ([\#8892](https://github.com/netdata/netdata/pull/8892)) by [vlvkobal](https://github.com/vlvkobal) - Fixed an issue with the PostgreSQL collector to correctly ignore template1/template0 databases. ([\#8929](https://github.com/netdata/netdata/pull/8929)) by [slavaGanzin](https://github.com/slavaGanzin) 2020-06-25T02:53:38+00:00 netdata v1.23.1 netdata v1.23.1 2020-07-01T13:54:41+00:00 # Netdata v1.23.1 Release v1.23.1 of the Netdata Agent is a patch for two significant issues. PR [#9436](https://github.com/netdata/netdata/pull/9436) fixed an issue where dimensions were marked obsolete and archived simultaneously, which caused segmentation faults. We're grateful to [marioem](https://github.com/marioem), who first reported the issue, and other members of the Netdata community who contributed their insights and valuable log information, which we used to diagnose and fix the bug. PR [#9428](https://github.com/netdata/netdata/pull/9428) fixed a significant issue with duplicate alarm IDs, which caused issues in how alarms were sent and displayed in Netdata Cloud. This release also contains a few additional bug fixes that were not fully reviewed before the release of v1.23.0. ## Bug fixes - Disallow dimensions and chart being obsolete and archived simultaneously.
([#9436](https://github.com/netdata/netdata/pull/9436), [@mfundul](https://github.com/mfundul)) - Fix duplicate alarm ids in health-log.db ([#9428](https://github.com/netdata/netdata/pull/9428), [@stelfrag](https://github.com/stelfrag)) - Show cgroups/containers ran by Kubelet without access to Kubernetes cluster information ([#9321](https://github.com/netdata/netdata/pull/9321), [@cakrit](https://github.com/cakrit)) - Fix children version on stream ([#9438](https://github.com/netdata/netdata/pull/9438), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix internal registry ([#9434](https://github.com/netdata/netdata/pull/9434), [@thiagoftsm](https://github.com/thiagoftsm)) - Correct virtualization detection in system-info.sh ([#9425](https://github.com/netdata/netdata/pull/9425), [@Ferroin](https://github.com/Ferroin)) - Fix the unittest execution ([#9445](https://github.com/netdata/netdata/pull/9445), [@thiagoftsm](https://github.com/thiagoftsm)) - Update description in registry with minor copy edits ([#9441](https://github.com/netdata/netdata/pull/9441), [@amoss](https://github.com/amoss)) - Stop reading from /proc/sys/kernel/osrelease at trailing newline ([#9374](https://github.com/netdata/netdata/pull/9374), [@sjuxax](https://github.com/sjuxax)) 2020-07-01T13:54:41+00:00 netdata v1.23.2 netdata v1.23.2 2020-07-16T11:31:40+00:00 # Netdata v1.23.2 Release v1.23.2 of the Netdata Agent is a patch for one significant issue. PR [#9491](https://github.com/netdata/netdata/pull/9491) fixed a buffer overrun vulnerability in Netdata's JSON parsing code. This vulnerability could be used to crash Agents remotely, and in some circumstances, could be used in an arbitrary code execution (ACE) exploit. _We strongly encourage all Netdata users to update their nodes to v1.23.2 as soon as possible._ This release also contains additional bug fixes and improvements. ## Acknowledgements - [@Saruspete](https://github.com/Saruspete) for adding Infiniband monitoring to Netdata! 
- [@meesaltena](https://github.com/meesaltena) for fixing a typo in `netdata-installer.sh`. - [@anirudhdggl](https://github.com/anirudhdggl) for tweaking the PyMySQL library to respect the `my.cnf` parameter when monitoring MySQL. - [@candrews](https://github.com/candrews) for cleaning up the exporting engine by wrapping header definitions in compilation conditions. - [@RubenKelevra](https://github.com/RubenKelevra) for deploying an update to the IPFS collector that makes it compatible with IPFS v0.5.0+. - [@vsc55](https://github.com/vsc55) for adding support for returning headers using python.d's UrlService. ## Improvements - Add support for multiple ACLK query processing threads ([#9355](https://github.com/netdata/netdata/pull/9355), [@underhood](https://github.com/underhood)) - Add Infiniband monitoring to collector proc.plugin ([#9091](https://github.com/netdata/netdata/pull/9091), [@Saruspete](https://github.com/Saruspete)) - Change the HTTP method to make the IPFS collector compatible with 0.5.0+ ([#9248](https://github.com/netdata/netdata/pull/9248), [@RubenKelevra](https://github.com/RubenKelevra)) - Add support for returning headers using python.d's UrlService ([#9236](https://github.com/netdata/netdata/pull/9236), [@vsc55](https://github.com/vsc55)) ## Documentation - Fix broken link in Kavenegar notification doc ([#9492](https://github.com/netdata/netdata/pull/9492), [@joelhans](https://github.com/joelhans)) - Add documentation for installing Netdata on k8s clusters ([#9364](https://github.com/netdata/netdata/pull/9364), [@joelhans](https://github.com/joelhans)) - Add notices to packaging docs for access errors and Cloud dependencies ([#9422](https://github.com/netdata/netdata/pull/9422), [@joelhans](https://github.com/joelhans)) - Fix broken link to Polyverse in Docker documentation ([#9426](https://github.com/netdata/netdata/pull/9426), [@joelhans](https://github.com/joelhans)) - Add notice to eBPF documentation about incompatibility with static builds 
([#9418](https://github.com/netdata/netdata/pull/9418), [@joelhans](https://github.com/joelhans)) ## Packaging / installation - Properly include eBPF collector in binary packages. ([#9450](https://github.com/netdata/netdata/pull/9450), [@Ferroin](https://github.com/Ferroin)) - Fix typo in netdata-installer.sh ([#9433](https://github.com/netdata/netdata/pull/9433), [@meesaltena](https://github.com/meesaltena)) - Fix broken link to Polyverse in Docker documentation ([#9426](https://github.com/netdata/netdata/pull/9426), [@joelhans](https://github.com/joelhans)) - Add first class support for FreeBSD ([#9413](https://github.com/netdata/netdata/pull/9413), [@prologic](https://github.com/prologic)) ## CI/CD - Disable CentOS 8.x CI (temporarily) ([#9538](https://github.com/netdata/netdata/pull/9538), [@prologic](https://github.com/prologic)) - Remove Fedora 30 from CI ([#9274](https://github.com/netdata/netdata/pull/9274), [@Ferroin](https://github.com/Ferroin)) ## Bug fixes - Fix vulnerability in JSON parsing ([#9491](https://github.com/netdata/netdata/pull/9491), [@underhood](https://github.com/underhood)) - Fixed stored number accuracy ([#9540](https://github.com/netdata/netdata/pull/9540), [@stelfrag](https://github.com/stelfrag)) - Fix transition from archived to active charts not generating alarms ([#9536](https://github.com/netdata/netdata/pull/9536), [@mfundul](https://github.com/mfundul)) - Fix PyMySQL library to respect `my.cnf` parameter ([#9526](https://github.com/netdata/netdata/pull/9526), [@anirudhdggl](https://github.com/anirudhdggl)) - Remove health from archived metrics ([#9520](https://github.com/netdata/netdata/pull/9520), [@mfundul](https://github.com/mfundul)) - Update exporting engine to read the prefix option from instance config sections ([#9463](https://github.com/netdata/netdata/pull/9463), [@vlvkobal](https://github.com/vlvkobal)) - Fix display error in Swagger API documentation ([#9417](https://github.com/netdata/netdata/pull/9417), 
[@underhood](https://github.com/underhood)) - Wrap exporting engine header definitions in compilation conditions ([#9458](https://github.com/netdata/netdata/pull/9458), [@candrews](https://github.com/candrews)) - Improve cgroups collector to autodetect unified cgroups ([#9249](https://github.com/netdata/netdata/pull/9249), [@underhood](https://github.com/underhood)) - Fix CMake build failing if ACLK is disabled ([#9537](https://github.com/netdata/netdata/pull/9537), [@underhood](https://github.com/underhood)) - Fix now_ms in charts.d collector to prevent tc-qos-helper crashes ([#9510](https://github.com/netdata/netdata/pull/9510), [@ilyam8](https://github.com/ilyam8)) - Fix python.d crashes by adding a lock to stdout write function ([#9508](https://github.com/netdata/netdata/pull/9508), [@ilyam8](https://github.com/ilyam8)) - Fix an issue with random crashes when updating a chart's metadata on the fly ([#9509](https://github.com/netdata/netdata/pull/9509), [@stelfrag](https://github.com/stelfrag)) - Fix ACLK protocol version always parsed as 0 ([#9502](https://github.com/netdata/netdata/pull/9502), [@underhood](https://github.com/underhood)) - Fix the check condition for chart name change ([#9503](https://github.com/netdata/netdata/pull/9503), [@stelfrag](https://github.com/stelfrag)) - Fix the exporting engine unit tests ([#9460](https://github.com/netdata/netdata/pull/9460), [@vlvkobal](https://github.com/vlvkobal)) - Fix a Coverity defect for resource leaks ([#9462](https://github.com/netdata/netdata/pull/9462), [@vlvkobal](https://github.com/vlvkobal)) 2020-07-16T11:31:40+00:00 netdata v1.24.0 netdata v1.24.0 2020-08-10T04:08:07+00:00 # Release v1.24.0 The v1.24.0 release of the Netdata Agent brings enhancements to the breadth of metrics we collect with a new generic Prometheus/OpenMetrics collector and enhanced storage and querying with a new multi-host database mode. 
## At a glance This release broadens our commitment to open standards, interoperability, and extensibility with a new generic Prometheus collector that works seamlessly with any application that makes its metrics available in the [Prometheus](https://prometheus.io/docs/instrumenting/exposition_formats/)/[OpenMetrics](https://github.com/OpenObservability/OpenMetrics) exposition format, including support for Windows 10 via [windows_exporter](https://github.com/prometheus-community/windows_exporter). Netdata will autodetect [over 600 Prometheus endpoints](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/prometheus.conf) and instantly generate charts with all the exposed metrics, meaningfully visualized. The Netdata Agent database engine enables long-term storage of per-second metrics inside the Agent using both RAM and disk space. In our new, multi-host database mode, parent and child nodes share resources in a single instance. Any pre-existing child node metrics remain in the legacy dbengine paths to ensure backward compatibility. To migrate those nodes to the new multi-host DB, simply [delete those metric cache paths](https://learn.netdata.cloud/docs/agent/database/engine#backward-compatibility). This new mode supports distributed queries for the Agent as well as specific scenarios like streaming metrics from the child to parent database, streaming multiple child nodes to a single parent, and remembering which child or children are connected to the database even if streaming hasn't started. 
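To make the exposition format mentioned above more concrete, here is a minimal Python sketch that parses a simplified subset of the Prometheus text format into `(name, labels, value)` tuples. It is an illustration only and not how Netdata's collector is implemented; the function name `parse_exposition`, the regular expressions, and the sample payload are all assumptions made for this example. It also ignores edge cases such as `}` inside label values and does not interpret `# HELP` or `# TYPE` metadata.

```python
import re

# Hypothetical, simplified pattern for one sample line:
#   metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {label="value",...}
    r'\s+(?P<value>\S+)'                     # sample value
    r'(?:\s+(?P<ts>-?\d+))?\s*$'             # optional timestamp (ignored)
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text.

    Comment lines (# HELP / # TYPE) and blank lines are skipped, and lines
    this simplified pattern cannot match are silently ignored.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Made-up payload in the style of a /metrics endpoint.
SAMPLE_TEXT = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
http_requests_total{method="get",code="200"} 4711 1395066363000
process_open_fds 12
"""

SAMPLES = parse_exposition(SAMPLE_TEXT)
```

Each distinct metric name and label set in `SAMPLES` corresponds roughly to one series a collector could chart; a production parser would also use the `# TYPE` lines to decide how counters and gauges are displayed.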
## Acknowledgments - [@lassebm](https://github.com/lassebm) for the FreeBSD interface error alarms - [@Saruspete](https://github.com/Saruspete) for fixing the RPM default permissions for /usr/libexec/netdata - [@Steve8291](https://github.com/Steve8291) for adjusting check-kernel-config.sh to run in bash - [@bmatheny](https://github.com/bmatheny) for adding pihole to the dns app group - [@tinyhammers](https://github.com/tinyhammers) for templatizing the health/megacli alarms ## New Features - Add generic Prometheus/OpenMetrics collector ([#9644](https://github.com/netdata/netdata/pull/9644), [@ilyam8](https://github.com/ilyam8)) - Add locking between different collectors for the same application, implemented in different technologies ([#9584](https://github.com/netdata/netdata/pull/9584), [@vlvkobal](https://github.com/vlvkobal)), ([#9564](https://github.com/netdata/netdata/pull/9564), [@ilyam8](https://github.com/ilyam8)) - Implement multihost database ([#9556](https://github.com/netdata/netdata/pull/9556), [@stelfrag](https://github.com/stelfrag)) - Add alarms for FreeBSD interface errors ([#8340](https://github.com/netdata/netdata/pull/8340), [@lassebm](https://github.com/lassebm)) ## Documentation - Add documentation to provide a comprehensive guide for package maintainers ([#9467](https://github.com/netdata/netdata/pull/9467), [@Ferroin](https://github.com/Ferroin)) ## Packaging / Installation - Remove delay in updater script for non-interactive runs from install scripts. ([#9589](https://github.com/netdata/netdata/pull/9589), [@Ferroin](https://github.com/Ferroin)) - Remove runtime support for Polymorphic Linux from our Docker containers.
([#9566](https://github.com/netdata/netdata/pull/9566), [@Ferroin](https://github.com/Ferroin)) - Add better checks for existing installs to the kickstart scripts. ([#9408](https://github.com/netdata/netdata/pull/9408), [@Ferroin](https://github.com/Ferroin)) - Require cloud build to succeed in make dist checks. ([#9218](https://github.com/netdata/netdata/pull/9218), [@Ferroin](https://github.com/Ferroin)) - Use the libbpf library for the eBPF plugin ([#9490](https://github.com/netdata/netdata/pull/9490), [@vlvkobal](https://github.com/vlvkobal)) - Fix Travis CI and remove deprecated/removed builds that have no upstream LXC image ([#9630](https://github.com/netdata/netdata/pull/9630), [@prologic](https://github.com/prologic)) - Fetch libbpf from netdata fork ([#9637](https://github.com/netdata/netdata/pull/9637), [@vlvkobal](https://github.com/vlvkobal)) - Fix RPM default permissions for /usr/libexec/netdata ([#9621](https://github.com/netdata/netdata/pull/9621), [@Saruspete](https://github.com/Saruspete)) - Add eBPF collector support to DEB and RPM packages. ([#9628](https://github.com/netdata/netdata/pull/9628), [@Ferroin](https://github.com/Ferroin)) - Add sandboxing exception for `/run/netdata`. ([#9613](https://github.com/netdata/netdata/pull/9613), [@Ferroin](https://github.com/Ferroin)) - Add proper handling for autogen on Ubuntu 18.04 ([#9586](https://github.com/netdata/netdata/pull/9586), [@Ferroin](https://github.com/Ferroin)) - Add CAP_SYS_RESOURCE to capability bounding set.
([#9569](https://github.com/netdata/netdata/pull/9569), [@Ferroin](https://github.com/Ferroin)) - Enable simple sandboxing on systemd service ([#9234](https://github.com/netdata/netdata/pull/9234), [@Izorkin](https://github.com/Izorkin)) - Revert the eBPF package bundling that breaks the release and DEB packages. ([#9552](https://github.com/netdata/netdata/pull/9552), [@prologic](https://github.com/prologic)) - Add libbpf patch to make dist. ([#9571](https://github.com/netdata/netdata/pull/9571), [@Ferroin](https://github.com/Ferroin)) ## Bug Fixes - charts.d: fix `current_time_ms_from_date` on macOS ([#9636](https://github.com/netdata/netdata/pull/9636), [@ilyam8](https://github.com/ilyam8)) - python.d/gearmand: handle func prefixes in `status\n` response ([#9610](https://github.com/netdata/netdata/pull/9610), [@ilyam8](https://github.com/ilyam8)) - Stop mdstat collector from looking up archived charts.
([#9583](https://github.com/netdata/netdata/pull/9583), [@mfundul](https://github.com/mfundul)) - Fixes mempcpy->memcpy ([#9575](https://github.com/netdata/netdata/pull/9575), [@underhood](https://github.com/underhood)) - charts.d.plugin: never use `-t` option for `timeout` ([#9568](https://github.com/netdata/netdata/pull/9568), [@ilyam8](https://github.com/ilyam8)) - health/megacli: change all instances of alarm to template ([#9553](https://github.com/netdata/netdata/pull/9553), [@tinyhammers](https://github.com/tinyhammers)) - Adjust check-kernel-config.sh to run in bash ([#9633](https://github.com/netdata/netdata/pull/9633), [@Steve8291](https://github.com/Steve8291)) ## Other Notable Changes - Send netdata.public.unique.id (machine GUID) with claim ([#9574](https://github.com/netdata/netdata/pull/9574), [@underhood](https://github.com/underhood)) - Add pihole to the dns app group ([#9557](https://github.com/netdata/netdata/pull/9557), [@bmatheny](https://github.com/bmatheny)) - Implemented the HOST command in metadata log replay ([#9489](https://github.com/netdata/netdata/pull/9489), [@stelfrag](https://github.com/stelfrag)) - Implemented default disk space size calculation for multihost db ([#9504](https://github.com/netdata/netdata/pull/9504), [@stelfrag](https://github.com/stelfrag)) - Suppress warning -Wformat-truncation in ACLK ([#9547](https://github.com/netdata/netdata/pull/9547), [@underhood](https://github.com/underhood)) - Dashboard improvements ([#9639](https://github.com/netdata/netdata/pull/9639), [@jacekkolasa](https://github.com/jacekkolasa)) 2020-08-10T04:08:07+00:00 netdata v1.25.0 netdata v1.25.0 2020-09-15T03:23:45+00:00 # Release v1.25.0 The v1.25.0 release of the Netdata Agent is focused on improving Netdata's usability across the board. We added more customization to how the Prometheus collector implemented in v1.24 meaningfully visualizes metrics. 
In addition, we've focused on fixing bugs and ensuring that core functionality of the Netdata Agent, such as the ACLK, works more efficiently. This release contains 1 new collector, 27 improvements, 15 documentation updates, and 59 bug fixes. ## At a glance Improved **filtering and grouping for the Prometheus collector** gives you more flexibility in how Netdata collects and visualizes metrics from more than 600 Prometheus endpoints. The Prometheus collector is designed to visualize every metric exposed on a Prometheus endpoint generically, but one chart for every metric is often not the most meaningful presentation. Filtering and grouping options bring the same "bespoke" feeling that you find in our other collectors, such as having input/output metrics on a single chart instead of two. You can read about [filtering](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus#time-series-selector-filtering) and [grouping](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus#time-series-grouping) in our documentation right now. If you haven't heard about the generic Prometheus collector, read our [v1.24 blog post](https://www.netdata.cloud/blog/release-1-24/) for details on why we continuously make Netdata more interoperable with other monitoring solutions. We also made significant improvements to the **robustness and responsiveness of the Agent-Cloud link** (ACLK), which is used to stream metrics and alarm status if you sign up for Netdata Cloud and claim your nodes. The disconnect and reconnect process is now more reliable, and all metrics data is now Gzip compressed. Now that the payloads are smaller and more quickly processed, you'll see improved responsiveness when viewing dashboards in [Netdata Cloud](https://app.netdata.cloud). We added a new **Elasticsearch** collector, written in Go, to help you collect metrics from and monitor Elasticsearch instances. 
This collector is preinstalled with the Netdata Agent and often works with zero configuration, but can also be tweaked to collect only specific stats, gather metrics with TLS, and more. See the [documentation](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/elasticsearch) for generated charts and configuration options. ## Acknowledgments We're grateful to the Netdata community for a huge wave of contributions for this release. - [@mklepaczewski](https://github.com/mklepaczewski) for adding a JSON log parser to the `go.d/web_log` collector. - [@glesys-andreas](https://github.com/glesys-andreas) for adding socket support for the `go.d/phpfpm` collector. - [@K900](https://github.com/K900) for adding and documenting how to read container names from Podman. - [@pando85](https://github.com/pando85) for fixing the link to Caddyfile's basicauth in the Docker documentation. - [@roedie](https://github.com/roedie) for improving Debian packaging by streamlining control and init files. - [@vsc55](https://github.com/vsc55) for adding support for IP ranges in the `python.d/isc_dhcpd` collector. - [@mrbarletta](https://github.com/mrbarletta) for fixing MySQL collector documentation to mention the `netdata` user. - [@Saruspete](https://github.com/Saruspete) for fixing RPM build script version issues. - [@michmach](https://github.com/michmach) for improving the uninstall script to correctly state if the group was deleted. - [@Steve8291](https://github.com/Steve8291) for removing PrivateMounts in systemd journal logs. - [@mrbrutti](https://github.com/mrbrutti) for updating `netdata-installer.sh` to enable Netdata Cloud support in macOS. - [@weijing24](https://github.com/weijing24) for adding RAM info for macOS to `system-info.sh`. - [@scottymuse](https://github.com/scottymuse) for fixing latency-avg chart units in the `python.d/dnsdist` collector. 
- [@Ancairon](https://github.com/Ancairon) for improving `proc.plugin` to collect the active processes limit on Linux systems. - [@scatenag](https://github.com/scatenag) for fixing TLS over LDAP in the `python.d/openldap` collector. - [@florianmagnin](https://github.com/florianmagnin) for adding new options to the `python.d/varnish` collector for multiple storage backends. - [@devinrsmith](https://github.com/devinrsmith) for fixing the print message when building for Ubuntu Focal. ## Improvements - Add code to release memory used by the global GUID map ([#9729](https://github.com/netdata/netdata/pull/9729), [@stelfrag](https://github.com/stelfrag)) - Add check for spurious wakeups ([#9751](https://github.com/netdata/netdata/pull/9751), [@vlvkobal](https://github.com/vlvkobal)) ### Netdata Cloud - Add v2 HTTP message with compression to ACLK ([#9895](https://github.com/netdata/netdata/pull/9895), [@underhood](https://github.com/underhood)) - Add version negotiation to ACLK ([#9819](https://github.com/netdata/netdata/pull/9819), [@underhood](https://github.com/underhood)) - Add `claimed_id` for child nodes streamed to their parents ([#9804](https://github.com/netdata/netdata/pull/9804), [@underhood](https://github.com/underhood)) - Update `netdata-installer.sh` to enable Netdata Cloud support in macOS ([#9360](https://github.com/netdata/netdata/pull/9360), [@mrbrutti](https://github.com/mrbrutti)) ### Collectors - Update go.d.plugin version to v0.22.0 ([#9898](https://github.com/netdata/netdata/pull/9898), [@ilyam8](https://github.com/ilyam8)) - Add JSON parser to weblog collector ([#417](https://github.com/netdata/go.d.plugin/pull/417), [@mklepaczewski](https://github.com/mklepaczewski)) - Update go.d.plugin version to v0.21.0 ([#9881](https://github.com/netdata/netdata/pull/9881), [@ilyam8](https://github.com/ilyam8)) - Add new Elasticsearch collector ([#421](https://github.com/netdata/go.d.plugin/pull/421), [@ilyam8](https://github.com/ilyam8)) - Add filtering 
option to Prometheus collector ([#416](https://github.com/netdata/go.d.plugin/pull/416), [@ilyam8](https://github.com/ilyam8)) - Add custom grouping option to Prometheus collector ([#418](https://github.com/netdata/go.d.plugin/pull/418), [@ilyam8](https://github.com/ilyam8)) - Add socket support to PHP-FPM collector ([#402](https://github.com/netdata/go.d.plugin/pull/402), [@glesys-andreas](https://github.com/glesys-andreas)) - Add support for IP ranges to Python-based isc_dhcpd collector ([#9755](https://github.com/netdata/netdata/pull/9755), [@vsc55](https://github.com/vsc55)) - Add Network viewer charts to `ebpf.plugin` ([#9591](https://github.com/netdata/netdata/pull/9591), [@thiagoftsm](https://github.com/thiagoftsm)) - Add collecting active processes limit on Linux systems ([#9843](https://github.com/netdata/netdata/pull/9843), [@Ancairon](https://github.com/Ancairon)) - Improve eBPF plugin by removing unnecessary debug messages ([#9754](https://github.com/netdata/netdata/pull/9754), [@thiagoftsm](https://github.com/thiagoftsm)) - Add CAP_SYS_CHROOT for netdata service to read LXD network interfaces ([#9726](https://github.com/netdata/netdata/pull/9726), [@vlvkobal](https://github.com/vlvkobal)) - Add collecting `maxmemory` to `python.d/redis` ([#9767](https://github.com/netdata/netdata/pull/9767), [@ilyam8](https://github.com/ilyam8)) - Add option for multiple storage backends in `python.d/varnish` ([#9668](https://github.com/netdata/netdata/pull/9668), [@florianmagnin](https://github.com/florianmagnin)) ### Dashboard - Update dashboard to v1.4.2 ([#9837](https://github.com/netdata/netdata/pull/9837), [@jacekkolasa](https://github.com/jacekkolasa)) - Disable calls to netdata.cloud when --disable-cloud option is used during installation ([#114](https://github.com/netdata/dashboard/pull/114), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix Y-axis and auto-scaling for constant values ([#115](https://github.com/netdata/dashboard/pull/115) &
[#117](https://github.com/netdata/dashboard/pull/117), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix broken dashboard when browser is configured to have no preferred language ([#118](https://github.com/netdata/dashboard/pull/118), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix d3-pie chart unit conversion on updates ([#119](https://github.com/netdata/dashboard/pull/119), [@jacekkolasa](https://github.com/jacekkolasa)) - Update dashboard to v1.3.1 ([#9786](https://github.com/netdata/netdata/pull/9786), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix stacked chart dimension visibility ([#113](https://github.com/netdata/dashboard/pull/113), [@jacekkolasa](https://github.com/jacekkolasa)) ### Packaging/installation - Improve handling of offline installs ([#9805](https://github.com/netdata/netdata/pull/9805), [@Ferroin](https://github.com/Ferroin)) - Improve Debian packaging by streamlining control and init files ([#8982](https://github.com/netdata/netdata/pull/8982), [@roedie](https://github.com/roedie)) - Remove dependency on libJudy for systems which don't have it ([#9859](https://github.com/netdata/netdata/pull/9859), [@Ferroin](https://github.com/Ferroin)) - Add code to bundle libJudy on systems which do not provide a usable copy of it ([#9776](https://github.com/netdata/netdata/pull/9776), [@Ferroin](https://github.com/Ferroin)) - Improve temporary directory checking in installer and updater ([#9797](https://github.com/netdata/netdata/pull/9797), [@Ferroin](https://github.com/Ferroin)) - Add proper certificate handling cURL in our static build ([#9733](https://github.com/netdata/netdata/pull/9733), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Improve and correct vulnerability reporting instructions ([#9696](https://github.com/netdata/netdata/pull/9696), [@cakrit](https://github.com/cakrit)) - Fix broken link in privacy policy ([#9771](https://github.com/netdata/netdata/pull/9771), [@joelhans](https://github.com/joelhans)) -
Update supported collectors doc to organize by type ([#9513](https://github.com/netdata/netdata/pull/9513), [@joelhans](https://github.com/joelhans)) - Change instruction to reload HEALTH ([#9869](https://github.com/netdata/netdata/pull/9869), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix typo in health documentation ([#9860](https://github.com/netdata/netdata/pull/9860), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix broken `Edit this page` link in simple patterns doc ([#9847](https://github.com/netdata/netdata/pull/9847), [@joelhans](https://github.com/joelhans)) - Remove Google Charts info from API doc ([#9826](https://github.com/netdata/netdata/pull/9826), [@joelhans](https://github.com/joelhans)) - Fix broken link and clean up frontmatter in health docs ([#9813](https://github.com/netdata/netdata/pull/9813), [@joelhans](https://github.com/joelhans)) - Improve dbengine docs and add new multihost setting ([#9817](https://github.com/netdata/netdata/pull/9817), [@joelhans](https://github.com/joelhans)) - Improve health docs by adding daemon config to health section and standardizing IP references ([#8837](https://github.com/netdata/netdata/pull/8837), [@joelhans](https://github.com/joelhans)) - Add and document support for reading container names from Podman in cgroups.plugin ([#9474](https://github.com/netdata/netdata/pull/9474), [@K900](https://github.com/K900)) - Fix docker packaging caddyserver basicauth link ([#9812](https://github.com/netdata/netdata/pull/9812), [@pando85](https://github.com/pando85)) - Fix MySQL collector documentation to mention `netdata` user ([#9555](https://github.com/netdata/netdata/pull/9555), [@mrbarletta](https://github.com/mrbarletta)) - Add community link to readme ([#9602](https://github.com/netdata/netdata/pull/9602), [@zack-shoylev](https://github.com/zack-shoylev)) - Add v1.24 news to main README ([#9721](https://github.com/netdata/netdata/pull/9721), [@aabatangle](https://github.com/aabatangle)) ## Bug fixes - Fix 
setting the default value of the home directory to the environment's HOME ([#9711](https://github.com/netdata/netdata/pull/9711), [@cakrit](https://github.com/cakrit)) - Fix memory mode none not dropping stale dimension data ([#9917](https://github.com/netdata/netdata/pull/9917), [@mfundul](https://github.com/mfundul)) - Fix memory mode none not marking dimensions as obsolete ([#9912](https://github.com/netdata/netdata/pull/9912), [@mfundul](https://github.com/mfundul)) - Fix race condition with orphan hosts ([#9862](https://github.com/netdata/netdata/pull/9862), [@mfundul](https://github.com/mfundul)) - Fix the log level in cgroup-network helper ([#9836](https://github.com/netdata/netdata/pull/9836), [@vlvkobal](https://github.com/vlvkobal)) - Fix empty dbengine files ([#9820](https://github.com/netdata/netdata/pull/9820), [@mfundul](https://github.com/mfundul)) - Fix timestamps for global variables in Prometheus output ([#9779](https://github.com/netdata/netdata/pull/9779), [@vlvkobal](https://github.com/vlvkobal)) - Fix long stats.d chart names (suggested by @vince-lessbits) ([#9783](https://github.com/netdata/netdata/pull/9783), [@amoss](https://github.com/amoss)) - Fix HTTP header for the remote write exporting connector ([#9775](https://github.com/netdata/netdata/pull/9775), [@vlvkobal](https://github.com/vlvkobal)) - Fix netfilter to close when receiving a SIGPIPE ([#9756](https://github.com/netdata/netdata/pull/9756), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix exporting update point ([#9748](https://github.com/netdata/netdata/pull/9748), [@vlvkobal](https://github.com/vlvkobal)) - Fix for ignored LXC containers ([#9645](https://github.com/netdata/netdata/pull/9645), [@vlvkobal](https://github.com/vlvkobal)) - Fix issue with missing alarms ([#9712](https://github.com/netdata/netdata/pull/9712), [@stelfrag](https://github.com/stelfrag)) - Fix child memory corruption by removing broken optimization in the sender thread 
([#9703](https://github.com/netdata/netdata/pull/9703), [@amoss](https://github.com/amoss)) - Fix crash when receiving malformed labels via streaming. ([#9715](https://github.com/netdata/netdata/pull/9715), [@mfundul](https://github.com/mfundul)) - Fix collectors on macOS and FreeBSD to ignore archived charts. ([#9695](https://github.com/netdata/netdata/pull/9695), [@mfundul](https://github.com/mfundul)) - Fix sending follow-up alarms when the initial status matches the notification ([#9698](https://github.com/netdata/netdata/pull/9698), [@cakrit](https://github.com/cakrit)) - Fix typo in option name used to use bundled libJudy ([#9893](https://github.com/netdata/netdata/pull/9893), [@prologic](https://github.com/prologic)) - Fix handling of libJudy bundling for RPM packages ([#9875](https://github.com/netdata/netdata/pull/9875), [@Ferroin](https://github.com/Ferroin)) - Fix another typo in the libJudy bundling code ([#9904](https://github.com/netdata/netdata/pull/9904), [@Ferroin](https://github.com/Ferroin)) - Fix missing newline concatenation slash causing failures in RPM builds ([#9900](https://github.com/netdata/netdata/pull/9900), [@prologic](https://github.com/prologic)) - Fix high CPU in IPFS collector by disabling call to the `/api/v0/stats/repo` endpoint by default ([#9687](https://github.com/netdata/netdata/pull/9687), [@ilyam8](https://github.com/ilyam8)) - Fix flushing errors ([#9738](https://github.com/netdata/netdata/pull/9738), [@mfundul](https://github.com/mfundul)) - Fix bugs in handling of Python 3 dependencies on install ([#9839](https://github.com/netdata/netdata/pull/9839), [@Ferroin](https://github.com/Ferroin)) - Fix RPM build script version issues ([#9808](https://github.com/netdata/netdata/pull/9808), [@Saruspete](https://github.com/Saruspete)) - Fix installation to not install eBPF plugin components when they shouldn't be installed ([#9844](https://github.com/netdata/netdata/pull/9844), [@vlvkobal](https://github.com/vlvkobal)) - Fixed
tmpdir handling failure on macOS/FreeBSD. ([#9842](https://github.com/netdata/netdata/pull/9842), [@Ferroin](https://github.com/Ferroin)) - Fix `netdata-uninstaller.sh` to correctly state whether the group was deleted ([#9835](https://github.com/netdata/netdata/pull/9835), [@michmach](https://github.com/michmach)) - Fix updater bug introduced by incomplete variable rename in #8808 ([#9834](https://github.com/netdata/netdata/pull/9834), [@Ferroin](https://github.com/Ferroin)) - Fixed bug in installer introduced by #8808 ([#9831](https://github.com/netdata/netdata/pull/9831), [@Ferroin](https://github.com/Ferroin)) - Fix systemd journal logs to remove PrivateMounts ([#9619](https://github.com/netdata/netdata/pull/9619), [@Steve8291](https://github.com/Steve8291)) - Fix netdata-updater.sh to correctly pass `REINSTALL_OPTIONS` ([#8808](https://github.com/netdata/netdata/pull/8808), [@prologic](https://github.com/prologic)) - Fix handling of offline installs ([#9805](https://github.com/netdata/netdata/pull/9805), [@Ferroin](https://github.com/Ferroin)) - Fix install if system does not have ebpf.plugin ([#9809](https://github.com/netdata/netdata/pull/9809), [@roedie](https://github.com/roedie)) - Fix packaging to enable eBPF collector only if enabled in config.h ([#9752](https://github.com/netdata/netdata/pull/9752), [@Saruspete](https://github.com/Saruspete)) - Fix numerous bugs in duplicate install handling ([#9769](https://github.com/netdata/netdata/pull/9769), [@Ferroin](https://github.com/Ferroin)) - Fix netdata/netdata Docker image size ([#9669](https://github.com/netdata/netdata/pull/9669), [@prologic](https://github.com/prologic)) - Fix global GUID map memory leak ([#9725](https://github.com/netdata/netdata/pull/9725), [@stelfrag](https://github.com/stelfrag)) - Fix buffer overflow in rrdr structure ([#9903](https://github.com/netdata/netdata/pull/9903), [@mfundul](https://github.com/mfundul)) - Fix HTTP error messages in alarm notifications 
([#9887](https://github.com/netdata/netdata/pull/9887), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix lock order reversal (Coverity defect CID 361629) ([#9888](https://github.com/netdata/netdata/pull/9888), [@mfundul](https://github.com/mfundul)) - Fix missing macOS RAM info in `system-info.sh` ([#9882](https://github.com/netdata/netdata/pull/9882), [@weijing24](https://github.com/weijing24)) - Fix latency-avg chart units in `python.d/dnsdist` ([#9871](https://github.com/netdata/netdata/pull/9871), [@scottymuse](https://github.com/scottymuse)) - Fix TLS over LDAP in the `python.d/openldap` collector ([#9853](https://github.com/netdata/netdata/pull/9853), [@scatenag](https://github.com/scatenag)) - Fix multi-host DB corruption when legacy metrics reside in localhost. ([#9855](https://github.com/netdata/netdata/pull/9855), [@mfundul](https://github.com/mfundul)) - Fix compilation warnings on FreeBSD ([#9845](https://github.com/netdata/netdata/pull/9845), [@underhood](https://github.com/underhood)) - Fix proxy forwarding claim_id to old parent ([#9828](https://github.com/netdata/netdata/pull/9828), [@underhood](https://github.com/underhood)) - Fix old dashboard third-party packaging ([#9814](https://github.com/netdata/netdata/pull/9814), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix loading custom dashboard_info in /old dashboard ([#9792](https://github.com/netdata/netdata/pull/9792), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix unit tests for exporting engine ([#9766](https://github.com/netdata/netdata/pull/9766), [@vlvkobal](https://github.com/vlvkobal)) - Fix code formatting for the mdstat collector ([#9749](https://github.com/netdata/netdata/pull/9749), [@vlvkobal](https://github.com/vlvkobal)) - Fix health notifications configuration to clarify which notifications are received when the "|critical" limit is set ([#9740](https://github.com/netdata/netdata/pull/9740), [@cakrit](https://github.com/cakrit)) - Fix print message when building for 
Ubuntu Focal ([#9694](https://github.com/netdata/netdata/pull/9694), [@devinrsmith](https://github.com/devinrsmith)) - Fix alarm redirection link for Cloud to stop showing 404 ([#9688](https://github.com/netdata/netdata/pull/9688), [@cakrit](https://github.com/cakrit)) 2020-09-15T03:23:45+00:00 netdata v1.26.0 netdata v1.26.0 2020-10-14T14:06:54+00:00 # Release v1.26.0 The v1.26.0 release of the Netdata Agent brings exciting new collectors written in Go, a new integration with the DevOps startup StackPulse, and massive improvements to the way users navigate Netdata's documentation. We've also added compatibility with an exciting new feature that's coming soon to Netdata Cloud. Stay tuned! This release contains 3 new collectors, 1 new notification method, 21 improvements, 13 documentation updates, and 12 bug fixes. ## At a glance The Netdata Agent can now **collect metrics from files/directories, systemd units, and ISC DHCP servers**. These new collectors are part of our larger effort to migrate _all_ collectors to Go, which provides more [extensibility](https://www.netdata.cloud/blog/software-extensibility-is-key-to-adoption/) compared to previous implementations. You can read about each of these new collectors in our docs: [filecheck](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/filecheck), [systemd](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/systemdunits), [isc_dhcpd](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/isc_dhcpd). We're excited to extend our health watchdog even further to **integrate with [StackPulse](https://stackpulse.com/)**, which is designed to help SREs manage and respond to incidents with code and automation. You can read more about how to configure Netdata to send notifications to StackPulse in the [docs](https://learn.netdata.cloud/docs/agent/health/notifications/stackpulse).
We **rearchitected our docs/education site**, Netdata Learn, to focus on users' actions rather than the Netdata Agent's hierarchy of code. The [core docs](https://learn.netdata.cloud/docs/) now better guide users through the most important actions, such as configuring collectors and interacting with charts, independent of whether they use only the Netdata Agent, or the Agent in combination with Netdata Cloud. Of course, all of our [reference documentation](https://learn.netdata.cloud/docs/agent) is still alive and kicking for those who want to dive into every configuration option or API query. We also revamped our [guides page](https://learn.netdata.cloud/guides) with better visuals, a search/filter, and more rational categories. ## Acknowledgments We're grateful to the Netdata community for their contributions to this release. - [@HolgerHees](https://github.com/HolgerHees) for fixing the comment syntax in Netdata's systemd file. - [@Saruspete](https://github.com/Saruspete) for fixing a file descriptor leak in the Infiniband collector (`proc.plugin`). - [@hamedbrd](https://github.com/hamedbrd) for adding a new Go-based systemd unit state collector and fixing gauges for the `go.d.plugin/web_log` collector. - [@chadknutson](https://github.com/chadknutson) for adding a chart for churn rates to `python.d/rabbitmq`. - [@hydrogen-mvm](https://github.com/hydrogen-mvm) for adding a missing period in the Netdata dashboard. - [@roedie](https://github.com/roedie) for adding a missing libelf-dev dependency. - [@Dim-P](https://github.com/Dim-P) and [@disko](https://github.com/disko) for documentation improvements. ## Improvements - Add the ability to send Agent alarm notifications to StackPulse. ([#9965](https://github.com/netdata/netdata/pull/9965), [@thiagoftsm](https://github.com/thiagoftsm)) - Add a way to get build configuration info from the Agent.
([#9913](https://github.com/netdata/netdata/pull/9913), [@Ferroin](https://github.com/Ferroin)) - Add chart for churn rates to `python.d/rabbitmq`. ([#10031](https://github.com/netdata/netdata/pull/10031), [@chadknutson](https://github.com/chadknutson)) - Add `failed` dim to the `connection_fails` alarm in the Portcheck alarm. ([#10048](https://github.com/netdata/netdata/pull/10048), [@ilyam8](https://github.com/ilyam8)) - Improve the data query when using the context parameter. ([#9978](https://github.com/netdata/netdata/pull/9978), [@stelfrag](https://github.com/stelfrag)) - Add a context parameter to the data endpoint. ([#9931](https://github.com/netdata/netdata/pull/9931), [@stelfrag](https://github.com/stelfrag)) ### Netdata Cloud - Change default ACLK query thread count. ([#10009](https://github.com/netdata/netdata/pull/10009), [@underhood](https://github.com/underhood)) - Remove leading whitespace before JSON in ACLK. ([#9998](https://github.com/netdata/netdata/pull/9998), [@underhood](https://github.com/underhood)) - Allow using libwebsockets without SOCKS5. ([#9973](https://github.com/netdata/netdata/pull/9973), [@underhood](https://github.com/underhood)) - Add information about Cloud disabled status to `-W buildinfo`. ([#9936](https://github.com/netdata/netdata/pull/9936), [@underhood](https://github.com/underhood)) ### Collectors - Update go.d.plugin version to `v0.23.0`. ([#10046](https://github.com/netdata/netdata/pull/10046), [@ilyam8](https://github.com/ilyam8)) - Add new filecheck collector. ([go.d.plugin/#445](https://github.com/netdata/go.d.plugin/pull/445), [@ilyam8](https://github.com/ilyam8)) - Add new systemd unit state collector. ([go.d.plugin/#439](https://github.com/netdata/go.d.plugin/pull/439), [@hamedbrd](https://github.com/hamedbrd)) - Add new ISC DHCP collector. ([go.d.plugin/#451](https://github.com/netdata/go.d.plugin/pull/451), [@thiagoftsm](https://github.com/thiagoftsm)) ### Dashboard - Add missing period in Netdata dashboard.
([#9960](https://github.com/netdata/netdata/pull/9960), [@hydrogen-mvm](https://github.com/hydrogen-mvm)) - Add missing tests to the web server. ([#10008](https://github.com/netdata/netdata/pull/10008), [@thiagoftsm](https://github.com/thiagoftsm)) ### Packaging/installation - Rename `NETDATA_PORT` to `NETDATA_LISTENER_PORT`. ([#10045](https://github.com/netdata/netdata/pull/10045), [@knatsakis](https://github.com/knatsakis)) - Add a few changes that were missed by the systemd updater support. ([#10007](https://github.com/netdata/netdata/pull/10007), [@Ferroin](https://github.com/Ferroin)) - Switch to our installer's bundling code for libJudy in static installs. ([#9988](https://github.com/netdata/netdata/pull/9988), [@Ferroin](https://github.com/Ferroin)) - Add improved auto-update support. ([#9966](https://github.com/netdata/netdata/pull/9966), [@Ferroin](https://github.com/Ferroin)) - Add missing libelf-dev dependency. ([#9974](https://github.com/netdata/netdata/pull/9974), [@roedie](https://github.com/roedie)) - Update RPM spec file to use automatic dependency list generation. ([#9937](https://github.com/netdata/netdata/pull/9937), [@Ferroin](https://github.com/Ferroin)) - Add support for using `/etc/cron.d` for auto-updates. ([#9598](https://github.com/netdata/netdata/pull/9598), [@Ferroin](https://github.com/Ferroin)) - Add more stringent check for C99 support in configure script. ([#9982](https://github.com/netdata/netdata/pull/9982), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Add note about using `nolock` when debugging. ([#10036](https://github.com/netdata/netdata/pull/10036), [@andrewm4894](https://github.com/andrewm4894)) - Update claiming document to instruct users to install `uuidgen`. ([#9925](https://github.com/netdata/netdata/pull/9925), [@OdysLam](https://github.com/OdysLam)) - Fix link in exporting document. 
([#10020](https://github.com/netdata/netdata/pull/10020), [@Dim-P](https://github.com/Dim-P)) - Clean up and better cross-link new `docsv2` documents. ([#10015](https://github.com/netdata/netdata/pull/10015), [@joelhans](https://github.com/joelhans)) - Update FreeBSD documentation with updated packages. ([#10005](https://github.com/netdata/netdata/pull/10005), [@disko](https://github.com/disko)) - Add documentation for claiming k8s parent pods and Prometheus service discovery. ([#10001](https://github.com/netdata/netdata/pull/10001), [@joelhans](https://github.com/joelhans)) - Add `docsv2` project to master branch. ([#10000](https://github.com/netdata/netdata/pull/10000), [@joelhans](https://github.com/joelhans)) - Fix setting for disabling eBPF-apps.plugin integration. ([#9967](https://github.com/netdata/netdata/pull/9967), [@joelhans](https://github.com/joelhans)) - Fix Stackpulse doc. ([#9968](https://github.com/netdata/netdata/pull/9968), [@thiagoftsm](https://github.com/thiagoftsm)) - Add persistent configuration details to Docker docs. ([#9926](https://github.com/netdata/netdata/pull/9926), [@joelhans](https://github.com/joelhans)) - Add guide for monitoring Pi-hole and Raspberry Pi. ([#9770](https://github.com/netdata/netdata/pull/9770), [@joelhans](https://github.com/joelhans)) - Add notice to Docker docs about systemd volumes. ([#9927](https://github.com/netdata/netdata/pull/9927), [@thiagoftsm](https://github.com/thiagoftsm)) - Add `mirrored_hosts_status` into Swagger docs. ([#9867](https://github.com/netdata/netdata/pull/9867), [@underhood](https://github.com/underhood)) ## Bug fixes - Fix systemd comment syntax. ([#10066](https://github.com/netdata/netdata/pull/10066), [@HolgerHees](https://github.com/HolgerHees)) - Fix file descriptor leak in Infiniband collector (`proc.plugin`). 
([#10013](https://github.com/netdata/netdata/pull/10013), [@Saruspete](https://github.com/Saruspete)) - Fix the data endpoint to prioritize chart over context if both are present. ([#10032](https://github.com/netdata/netdata/pull/10032), [@stelfrag](https://github.com/stelfrag)) - Fix cleanup of obsolete charts. ([#9985](https://github.com/netdata/netdata/pull/9985), [@mfundul](https://github.com/mfundul)) - Fix typos in installer functions. ([#9992](https://github.com/netdata/netdata/pull/9992), [@Ferroin](https://github.com/Ferroin)) - Fix typo inside netdata-installer.sh ([#9962](https://github.com/netdata/netdata/pull/9962), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix build for the AWS Kinesis exporting connector. ([#9823](https://github.com/netdata/netdata/pull/9823), [@vlvkobal](https://github.com/vlvkobal)) - Fix incorrect condition in updater type detection. ([#10028](https://github.com/netdata/netdata/pull/10028), [@Ferroin](https://github.com/Ferroin)) - Fix gauges for `go.d.plugin/web_log` collector. ([#10029](https://github.com/netdata/netdata/pull/10029), [@hamedbrd](https://github.com/hamedbrd)) - Fix locking order to address CID_362348. ([#9991](https://github.com/netdata/netdata/pull/9991), [@stelfrag](https://github.com/stelfrag)) - Fix chart's last accessed time during context queries. ([#9952](https://github.com/netdata/netdata/pull/9952), [@stelfrag](https://github.com/stelfrag)) - Fix resource leak in case of malformed request to Netdata Cloud. ([#9934](https://github.com/netdata/netdata/pull/9934), [@underhood](https://github.com/underhood)) 2020-10-14T14:06:54+00:00 netdata v1.27.0 netdata v1.27.0 2020-12-17T14:16:12+00:00 # Release v1.27.0 The v1.27.0 release of the Netdata Agent brings dramatic improvements to long-term metrics storage via the database engine, and new dashboard features like a time & date picker for visualizing precise timeframes. 
Two new collectors bring incredible new value to existing features, including a bit of machine learning magic. This release contains 8 new collectors, 1 new notification method (2 others enhanced), 54 improvements, 41 documentation updates, and 58 bug fixes. ## At a glance The Netdata Agent now uses SQLite to store host, chart, and dimension metadata. This replaces the old metadata log files, which were located inside of the `/var/cache/netdata/dbengine` folder for both multihost and legacy child nodes streaming to a parent node. With SQLite powering the metadata log, you should notice faster Agent startups, as it no longer needs to replay metadata log files. The Agent no longer puts archived charts into memory on startup, further reducing memory usage. This is just the first of several improvements to the database engine and metadata log, with more coming in future releases. The database engine now uses a new extent cache that improves query time by 10% under certain workloads and reduces disk I/O by 10%. The Netdata Agent's local dashboard has received numerous improvements and bugfixes since v1.26. Perhaps most prominent is the new time & date picker, which helps you select precise timeframes when investigating an anomaly or troubleshooting an incident. See the [dashboard repository's releases](https://github.com/netdata/dashboard/releases) for the full changelog. We also introduced two new collectors that monitor the Netdata Agent itself in unique ways. First is the [**anomalies collector**](https://learn.netdata.cloud/docs/agent/collectors/python.d.plugin/anomalies), which uses machine learning (ML) to perform unsupervised anomaly detection on a node running the Netdata Agent. This collector trains itself to understand the baseline of specific charts, then charts anomalous data. A new [**alarms collector**](https://learn.netdata.cloud/docs/agent/collectors/python.d.plugin/alarms) visualizes the volume of Netdata alarms triggered over time. 
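To make the "train on a baseline, then flag anomalous data" idea behind the anomalies collector more concrete, here is a minimal sketch in Python. It is not the collector's implementation: it swaps in a plain z-score test for the collector's ML models, and the function names, the sample values, and the threshold of 3 standard deviations are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def train_baseline(values):
    """Summarize a training window as (mean, standard deviation).

    Hypothetical helper: stands in for the model-training step.
    """
    return mean(values), stdev(values)


def anomaly_flags(baseline, values, threshold=3.0):
    """Return one boolean per value: True if it deviates from the baseline
    by more than `threshold` standard deviations (illustrative default)."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as anomalous.
        return [v != mu for v in values]
    return [abs(v - mu) / sigma > threshold for v in values]


# Made-up CPU utilization samples (percent) used as the training window.
training = [20.0, 21.5, 19.8, 20.3, 20.9, 19.5, 20.1, 21.0]
baseline = train_baseline(training)

# New observations; the 55.0 spike is far outside the learned baseline.
flags = anomaly_flags(baseline, [20.4, 19.9, 55.0, 20.7])
print(flags)  # [False, False, True, False]
```

A real deployment would retrain periodically and use richer models than a single mean and standard deviation; the sketch only shows the two-phase shape (learn a baseline, then score new data against it).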
## Acknowledgments - Ali Dinifar, from ZDResearch, for reporting a stack buffer overflow vulnerability in `web_client.c`, which our team resolved within 24 hours. - [@ernestojpg](https://github.com/ernestojpg) for adding the number of allocated/stored objects within each storage to the `varnish` collector. - [@ernestojpg](https://github.com/ernestojpg) for adding support for MSE (Massive Storage Engine) to the `varnish` collector. - [@jurgenhaas](https://github.com/jurgenhaas) for adding allocated space metrics to the `oracledb` collector. - [@autoalan](https://github.com/autoalan) for fixing a spelling mistake in the `haproxy` collector README. - [@ysamouhos](https://github.com/ysamouhos) for fixing a spelling mistake in UPDATE.md. - [@voriol](https://github.com/voriol) for fixing the Ansible deployment guide. - [@scatenag](https://github.com/scatenag) for adding an option to exclude zero memory allocated users to the `nvidia_smi` collector. - [@fayak](https://github.com/fayak) for adding per queue charts to the `rabbitmq` collector. - [@atnartur](https://github.com/atnartur) for fixing Markdown syntax in the custom dashboard documentation. - [@alexmyczko](https://github.com/alexmyczko) for removing redundant build dependencies from Debian control file. - [@KickerTom](https://github.com/KickerTom) for fixing compilation with HTTPS disabled. - [@hexchain](https://github.com/hexchain) for fixing a database endless loop bug when cleaning obsolete charts. - [@wash2](https://github.com/wash2) for fixing the `libreswan` collector parsing. - [@Saruspete](https://github.com/Saruspete) for fixing a platform dependent printf format. - [@KickerTom](https://github.com/KickerTom) for fixing an eBPF cross compilation error and updating libnetdata headers to be compatible with C++. - [@WBTMagnum](https://github.com/WBTMagnum) for fixing typos in the README.md. - [@Jiab77](https://github.com/Jiab77) for adding support to hide the SSO iframe. 
- [@martinpal](https://github.com/martinpal) for adding HBA drives support to the `hpssa` collector. - [@hamedbrd](https://github.com/hamedbrd) for fixing response and upstream response time histogram charts in the `web_log` collector. - [@hamedbrd](https://github.com/hamedbrd) for adding custom time fields feature to the `web_log` collector. - [@ski2per](https://github.com/ski2per) for adding directories size collection to the `filecheck` collector. ## Improvements - Add labels for Kubernetes pods and containers. ([#10107](https://github.com/netdata/netdata/pull/10107), [@ilyam8](https://github.com/ilyam8)) - Add `plugin` and `module` health entities. ([#10041](https://github.com/netdata/netdata/pull/10041), [@thiagoftsm](https://github.com/thiagoftsm)) - Migrate the metadata log to SQLite. ([#10139](https://github.com/netdata/netdata/pull/10139), [@stelfrag](https://github.com/stelfrag)) - Add an extent cache to the database engine. ([#10293](https://github.com/netdata/netdata/pull/10293), [@mfundul](https://github.com/mfundul)) - Added new data query option `allow_past`. ([#10112](https://github.com/netdata/netdata/pull/10112), [@stelfrag](https://github.com/stelfrag)) ### Netdata Cloud - Add the ability to query child nodes by their GUID. ([#10030](https://github.com/netdata/netdata/pull/10030), [@underhood](https://github.com/underhood)) - Add child availability messages to the ACLK. ([#9918](https://github.com/netdata/netdata/pull/9918), [@underhood](https://github.com/underhood)) - Add a metric showing how long a query spent in the queue. ([#10016](https://github.com/netdata/netdata/pull/10016), [@underhood](https://github.com/underhood)) - Completely hide the SSO iframe. ([#10027](https://github.com/netdata/netdata/pull/10027), [@Jiab77](https://github.com/Jiab77)) ### Collectors - Add alarms obsoletion and disable alarms collector by default. ([#10375](https://github.com/netdata/netdata/pull/10375), [@ilyam8](https://github.com/ilyam8)). 
- Add charts for calls to the `tcp_sendmsg`, `tcp_retransmit_skb`, `tcp_cleanup_rcv`, `udp_sendmsg`, and `udp_recvmsg` functions to the eBPF collector. ([#10360](https://github.com/netdata/netdata/pull/10360), [@thiagoftsm](https://github.com/thiagoftsm)) - Add two more insignificant warnings to suppress in the anomalies collector. ([#10369](https://github.com/netdata/netdata/pull/10369), [@andrewm4894](https://github.com/andrewm4894)) - Add the number of allocated/stored objects within each storage to the `varnish` collector. ([#10329](https://github.com/netdata/netdata/pull/10329), [@ernestojpg](https://github.com/ernestojpg)) - Add a wireless statistics collector. ([#10052](https://github.com/netdata/netdata/pull/10052), [@thiagoftsm](https://github.com/thiagoftsm)) - Add support for MSE (Massive Storage Engine) to the `varnish` collector. ([#10317](https://github.com/netdata/netdata/pull/10317), [@ernestojpg](https://github.com/ernestojpg)) - Remove `crit` from unmatched alarms in the `web_log` collector. ([#10280](https://github.com/netdata/netdata/pull/10280), [@ilyam8](https://github.com/ilyam8)) - Add GPU key metrics (`nvidia_smi` collector) to `dashboard_info.js`. ([#10230](https://github.com/netdata/netdata/pull/10230), [@ilyam8](https://github.com/ilyam8)) - Add allocated space metrics to the `oracledb` collector. ([#10197](https://github.com/netdata/netdata/pull/10197), [@jurgenhaas](https://github.com/jurgenhaas)) - Restructure the eBPF collector to improve usability. ([#10299](https://github.com/netdata/netdata/pull/10299), [@thiagoftsm](https://github.com/thiagoftsm)) - Add an anomaly detection collector. ([#10060](https://github.com/netdata/netdata/pull/10060), [@andrewm4894](https://github.com/andrewm4894)) - Add a Netdata alarms collector. ([#10042](https://github.com/netdata/netdata/pull/10042), [@andrewm4894](https://github.com/andrewm4894)) - Add a configuration option to exclude users with zero memory allocated to the `nvidia_smi` collector. 
([#10098](https://github.com/netdata/netdata/pull/10098), [@scatenag](https://github.com/scatenag)) - Add per queue charts to the `rabbitmq` collector. ([#10064](https://github.com/netdata/netdata/pull/10064), [@fayak](https://github.com/fayak)) - Add support for HBA drives to the `hpssa` collector. ([#10093](https://github.com/netdata/netdata/pull/10093), [@martinpal](https://github.com/martinpal)) - Update the `cgroups` collector default filtering by adding pod level cgroups. ([#10095](https://github.com/netdata/netdata/pull/10095), [@ilyam8](https://github.com/ilyam8)) - Add a Go version of the CouchDB collector (`couchdb`). ([go.d.plugin/#453](https://github.com/netdata/go.d.plugin/pull/453), [@vlvkobal](https://github.com/vlvkobal)) - Add collecting HTTP method per URL pattern (_url_pattern_ option) to the `web_log` collector. ([go.d.plugin/#458](https://github.com/netdata/go.d.plugin/pull/458), [@ilyam8](https://github.com/ilyam8)) - Add custom time fields feature to the `web_log` collector. ([go.d.plugin/#467](https://github.com/netdata/go.d.plugin/pull/467), [@hamedbrd](https://github.com/hamedbrd)) - Add a Go version of the PowerDNS Authoritative Nameserver collector (`powerdns`). ([go.d.plugin/#501](https://github.com/netdata/go.d.plugin/pull/501), [@ilyam8](https://github.com/ilyam8)) - Add a Go version of the PowerDNS Recursor collector (`powerdns_recursor`). ([go.d.plugin/#495](https://github.com/netdata/go.d.plugin/pull/495), [@ilyam8](https://github.com/ilyam8)) - Add a Go version of the PowerDNS DNSdist collector (`dnsdist`). ([go.d.plugin/#504](https://github.com/netdata/go.d.plugin/pull/504), [@thiagoftsm](https://github.com/thiagoftsm)) - Add a Dnsmasq DNS Forwarder collector (`dnsmasq`). ([go.d.plugin/#503](https://github.com/netdata/go.d.plugin/pull/503), [@ilyam8](https://github.com/ilyam8)) - Add collecting directories size to the `filecheck` collector. 
([go.d.plugin/#487](https://github.com/netdata/go.d.plugin/pull/487), [@ski2per](https://github.com/ski2per)) - Add old systemd versions support to the `systemdunits` collector. ([go.d.plugin/#502](https://github.com/netdata/go.d.plugin/pull/502), [@ilyam8](https://github.com/ilyam8)) - Add unmatched lines logging to the `web_log` collector. ([go.d.plugin/#514](https://github.com/netdata/go.d.plugin/pull/514), [@ilyam8](https://github.com/ilyam8)) ### Notifications - Add API V2 support to the PagerDuty health integration. ([#10189](https://github.com/netdata/netdata/pull/10189), [@thiagoftsm](https://github.com/thiagoftsm)) - Add threads support to the Google Hangouts health integration. ([#10160](https://github.com/netdata/netdata/pull/10160), [@thiagoftsm](https://github.com/thiagoftsm)) - Add an Opsgenie health integration. ([#9879](https://github.com/netdata/netdata/pull/9879), [@thiagoftsm](https://github.com/thiagoftsm)) ### Exporting - Add HTTP and HTTPS support to the simple exporting connector. ([#9911](https://github.com/netdata/netdata/pull/9911), [@vlvkobal](https://github.com/vlvkobal)) ## Packaging/installation - Update React dashboard to v2.11. ([#10383](https://github.com/netdata/netdata/pull/10383), [@jacekkolasa](https://github.com/jacekkolasa)) - Update go.d.plugin version to v0.26.2. ([#10355](https://github.com/netdata/netdata/pull/10355), [@ilyam8](https://github.com/ilyam8)) - Add numerous improvements to our Docker image. ([#10338](https://github.com/netdata/netdata/pull/10338), [@Ferroin](https://github.com/Ferroin)) - Use `glibtoolize` on macOS instead of regular `libtoolize`. ([#10346](https://github.com/netdata/netdata/pull/10346), [@Ferroin](https://github.com/Ferroin)) - Make the update script significantly more robust and user friendly. ([#10261](https://github.com/netdata/netdata/pull/10261), [@Ferroin](https://github.com/Ferroin)) - Update go.d.plugin version to v0.26.1. 
([#10319](https://github.com/netdata/netdata/pull/10319), [@ilyam8](https://github.com/ilyam8)) - Update React dashboard to v2.10.1. ([#10314](https://github.com/netdata/netdata/pull/10314), [@jacekkolasa](https://github.com/jacekkolasa)) - Update go.d.plugin version to v0.26.0. ([#10284](https://github.com/netdata/netdata/pull/10284), [@ilyam8](https://github.com/ilyam8)) - Update third-party static dependencies and use alpine 3.12. ([#10241](https://github.com/netdata/netdata/pull/10241), [@ktsaou](https://github.com/ktsaou)) - Update React dashboard to v2.9.2. ([#10239](https://github.com/netdata/netdata/pull/10239), [@jacekkolasa](https://github.com/jacekkolasa)) - Update eBPF collector to 0.4.9. ([#10202](https://github.com/netdata/netdata/pull/10202), [@thiagoftsm](https://github.com/thiagoftsm)) - Update go.d.plugin version to v0.25.0. ([#10215](https://github.com/netdata/netdata/pull/10215), [@ilyam8](https://github.com/ilyam8)) - Update React dashboard to v2.7.5. ([#10179](https://github.com/netdata/netdata/pull/10179), [@jacekkolasa](https://github.com/jacekkolasa)) - Add ability to use system libwebsockets instead of bundled version. ([#9984](https://github.com/netdata/netdata/pull/9984), [@underhood](https://github.com/underhood)) - Update the version of libJudy that we bundle to `1.0.5-netdata2`. ([#10158](https://github.com/netdata/netdata/pull/10158), [@Ferroin](https://github.com/Ferroin)) - Update React dashboard to v2.7.4. ([#10122](https://github.com/netdata/netdata/pull/10122), [@jacekkolasa](https://github.com/jacekkolasa)) - Update go.d.plugin version to v0.24.0. ([#10109](https://github.com/netdata/netdata/pull/10109), [@ilyam8](https://github.com/ilyam8)) - Remove redundant build dependencies from Debian control file. ([#10085](https://github.com/netdata/netdata/pull/10085), [@alexmyczko](https://github.com/alexmyczko)) ### CI/CD - Switch to using official Docker actions for GHA CI. 
([#10364](https://github.com/netdata/netdata/pull/10364), [@Ferroin](https://github.com/Ferroin)) - Explicitly set platform for Docker builds. ([#10357](https://github.com/netdata/netdata/pull/10357), [@Ferroin](https://github.com/Ferroin)) - Update distros for CI checks and package builds. ([#10123](https://github.com/netdata/netdata/pull/10123), [@Ferroin](https://github.com/Ferroin)) - Remove usage of deprecated GHA syntax. ([#10154](https://github.com/netdata/netdata/pull/10154), [@Ferroin](https://github.com/Ferroin)) - Split ReviewDog check to only run when relevant. ([#10148](https://github.com/netdata/netdata/pull/10148), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Add documentation for time & date picker in Agent and Cloud. ([#10347](https://github.com/netdata/netdata/pull/10347), [@joelhans](https://github.com/joelhans)) - Add paragraph in anomalies collector README to ask for feedback. ([#10363](https://github.com/netdata/netdata/pull/10363), [@andrewm4894](https://github.com/andrewm4894)) - Fix typo in performance guide. ([#10386](https://github.com/netdata/netdata/pull/10386), [@OdysLam](https://github.com/OdysLam)) - Update alarms collector README with fixed image. ([#10348](https://github.com/netdata/netdata/pull/10348), [@andrewm4894](https://github.com/andrewm4894)) - Update macOS instructions with new Homebrew installation command. ([#10379](https://github.com/netdata/netdata/pull/10379), [@ktsaou](https://github.com/ktsaou)) - Update macOS instructions with cmake. ([#10295](https://github.com/netdata/netdata/pull/10295), [@joelhans](https://github.com/joelhans)) - Add guide: Monitor any process in real-time with Netdata. ([#10338](https://github.com/netdata/netdata/pull/10338), [@joelhans](https://github.com/joelhans)) - Improve core documentation to align with recent Netdata Cloud releases. 
([#10318](https://github.com/netdata/netdata/pull/10318), [@joelhans](https://github.com/joelhans)) - Add info about network usage requirements for the update script. ([#10334](https://github.com/netdata/netdata/pull/10334), [@Ferroin](https://github.com/Ferroin)) - Add new collectors to supported collectors list. ([#10310](https://github.com/netdata/netdata/pull/10310), [@joelhans](https://github.com/joelhans)) - Document the Agent reinstallation process. ([#10270](https://github.com/netdata/netdata/pull/10270), [@joelhans](https://github.com/joelhans)) - Add privacy information about ACLK connection. ([#10292](https://github.com/netdata/netdata/pull/10292), [@OdysLam](https://github.com/OdysLam)) - Improve `python.d` plugin PR checklist README section. ([#10302](https://github.com/netdata/netdata/pull/10302), [@andrewm4894](https://github.com/andrewm4894)) - Fix a spelling error in the HAProxy documentation. ([#10300](https://github.com/netdata/netdata/pull/10300), [@autoalan](https://github.com/autoalan)) - Fix a spelling error in the update documentation. ([#10301](https://github.com/netdata/netdata/pull/10301), [@ysamouhos](https://github.com/ysamouhos)) - Add guide: How to optimize Netdata's performance. ([#10271](https://github.com/netdata/netdata/pull/10271), [@joelhans](https://github.com/joelhans)) - Fix a syntax error in `bug_report.md`. ([#10269](https://github.com/netdata/netdata/pull/10269), [@OdysLam](https://github.com/OdysLam)) - Add new issue templates. ([#10259](https://github.com/netdata/netdata/pull/10259), [@OdysLam](https://github.com/OdysLam)) - Remove Docker example from update docs and add section to claim troubleshooting. ([#10103](https://github.com/netdata/netdata/pull/10103), [@joelhans](https://github.com/joelhans)) - Improve docs to point users to proper configuration information. ([#10254](https://github.com/netdata/netdata/pull/10254), [@joelhans](https://github.com/joelhans)) - Fix Docs GitHub Action with ignore list and update. 
([#10002](https://github.com/netdata/netdata/pull/10002), [@joelhans](https://github.com/joelhans)) - Fix broken links in documentation. ([#10253](https://github.com/netdata/netdata/pull/10253), [@joelhans](https://github.com/joelhans)) - Fix a broken link in the Ansible guide. ([#10232](https://github.com/netdata/netdata/pull/10232), [@voriol](https://github.com/voriol)) - Add guide: Deploy Netdata with Ansible. ([#10199](https://github.com/netdata/netdata/pull/10199), [@joelhans](https://github.com/joelhans)) - Fix a typo in the streaming doc. ([#10225](https://github.com/netdata/netdata/pull/10225), [@ilyam8](https://github.com/ilyam8)) - Fix repeated frontmatter in exporting docs. ([#10211](https://github.com/netdata/netdata/pull/10211), [@joelhans](https://github.com/joelhans)) - Update k8s docs with new Helm repo. ([#10172](https://github.com/netdata/netdata/pull/10172), [@joelhans](https://github.com/joelhans)) - Add a warning to exporting docs about an issue with the newest gRPC versions. ([#10194](https://github.com/netdata/netdata/pull/10194), [@vlvkobal](https://github.com/vlvkobal)) - Add supported notification platforms to docs. ([#10170](https://github.com/netdata/netdata/pull/10170), [@joelhans](https://github.com/joelhans)) - Add notices to FreeBSD/pfSense docs that they are community-supported. ([#10171](https://github.com/netdata/netdata/pull/10171), [@joelhans](https://github.com/joelhans)) - Fix configuration category in the Prometheus remote write doc. ([#10145](https://github.com/netdata/netdata/pull/10145), [@OdysLam](https://github.com/OdysLam)) - Fix broken links. ([#10115](https://github.com/netdata/netdata/pull/10115), [@joelhans](https://github.com/joelhans)) - Add documentation for Cloud Overview. ([#10082](https://github.com/netdata/netdata/pull/10082), [@joelhans](https://github.com/joelhans)) - Update supported collectors list with new collectors. 
([#10102](https://github.com/netdata/netdata/pull/10102), [@joelhans](https://github.com/joelhans)) - Fix formatting source code blocks in custom dashboard page. ([#10050](https://github.com/netdata/netdata/pull/10050), [@atnartur](https://github.com/atnartur)) - Add more robust documentation around updates. ([#10100](https://github.com/netdata/netdata/pull/10100), [@Ferroin](https://github.com/Ferroin)) - Update `CONTRIBUTING.md` with new meta title. ([#10252](https://github.com/netdata/netdata/pull/10252), [@joelhans](https://github.com/joelhans)) - Update the Code of Conduct and widen scope to community. ([#10186](https://github.com/netdata/netdata/pull/10186), [@OdysLam](https://github.com/OdysLam)) - Update contact information in the Code of Conduct. ([#10161](https://github.com/netdata/netdata/pull/10161), [@aabatangle](https://github.com/aabatangle)) - Fix typos in the main README. ([#10146](https://github.com/netdata/netdata/pull/10146), [@WBTMagnum](https://github.com/WBTMagnum)) - Rewrite the repository's main README. ([#10108](https://github.com/netdata/netdata/pull/10108), [@joelhans](https://github.com/joelhans)) ## Bug fixes - Fix option parsing in `kickstart.sh`. ([#10396](https://github.com/netdata/netdata/pull/10396), [@Ferroin](https://github.com/Ferroin)) - Fix handling of dependencies on Gentoo. ([#10382](https://github.com/netdata/netdata/pull/10382), [@Ferroin](https://github.com/Ferroin)) - Fix crash in the eBPF plugin by initializing variables. ([#10395](https://github.com/netdata/netdata/pull/10395), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix sending chart definition on every data collection in alarms collector. ([#10378](https://github.com/netdata/netdata/pull/10378), [@ilyam8](https://github.com/ilyam8)) - Fix a lock check. ([#10385](https://github.com/netdata/netdata/pull/10385), [@vlvkobal](https://github.com/vlvkobal)) - Fix issue with chart metadata sent multiple times over ACLK. 
([#10381](https://github.com/netdata/netdata/pull/10381), [@stelfrag](https://github.com/stelfrag)) - Fix a buffer overflow when extracting information from a streaming connection. ([#10391](https://github.com/netdata/netdata/pull/10391), [@stelfrag](https://github.com/stelfrag)) - Fix hostname configuration in the exporting engine. ([#10361](https://github.com/netdata/netdata/pull/10361), [@vlvkobal](https://github.com/vlvkobal)) - Fix use of multiarch/qemu-user-static image for Docker builds. ([#10352](https://github.com/netdata/netdata/pull/10352), [@Ferroin](https://github.com/Ferroin)) - Fix handling of self-updating in updater script. ([#10352](https://github.com/netdata/netdata/pull/10352), [@Ferroin](https://github.com/Ferroin)) - Fix handling of Python dependency for RPM package. ([#10345](https://github.com/netdata/netdata/pull/10345), [@Ferroin](https://github.com/Ferroin)) - Fix handling of PowerTools repo on CentOS 8. ([#10334](https://github.com/netdata/netdata/pull/10334), [@Ferroin](https://github.com/Ferroin)) - Fix units and data source exporting options. ([#10343](https://github.com/netdata/netdata/pull/10343), [@vlvkobal](https://github.com/vlvkobal)) - Fix building libwebsockets properly on macOS. ([#10333](https://github.com/netdata/netdata/pull/10333), [@Ferroin](https://github.com/Ferroin)) - Fix exporting config. ([#10323](https://github.com/netdata/netdata/pull/10323), [@vlvkobal](https://github.com/vlvkobal)) - Fix health by disabling `used_file_descriptors` alarm. ([#10328](https://github.com/netdata/netdata/pull/10328), [@ilyam8](https://github.com/ilyam8)) - Fix GPU data filtering in the `nvidia_smi` collector. ([#10312](https://github.com/netdata/netdata/pull/10312), [@ilyam8](https://github.com/ilyam8)) - Fix username resolution in the `nvidia_smi` collector. ([#10268](https://github.com/netdata/netdata/pull/10268), [@ilyam8](https://github.com/ilyam8)) - Fix compilation with HTTPS disabled. 
([#10279](https://github.com/netdata/netdata/pull/10279), [@KickerTom](https://github.com/KickerTom)) - Fix hostname when syslog is used in syslog health integration. ([#10275](https://github.com/netdata/netdata/pull/10275), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix kernel crash caused by EBPF in Ubuntu 4.18.0-25 by adding it to the reject list. ([#10262](https://github.com/netdata/netdata/pull/10262), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix streaming buffer size. ([#10240](https://github.com/netdata/netdata/pull/10240), [@vlvkobal](https://github.com/vlvkobal)) - Fix database endless loop when cleaning obsolete charts. ([#10236](https://github.com/netdata/netdata/pull/10236), [@hexchain](https://github.com/hexchain)) - Disable chart obsoletion code for archived chart creation. ([#10231](https://github.com/netdata/netdata/pull/10231), [@mfundul](https://github.com/mfundul)) - Fix Prometheus remote write exporter so that it doesn't stop when data is not available for dimension formatting. ([#10217](https://github.com/netdata/netdata/pull/10217), [@vlvkobal](https://github.com/vlvkobal)) - Fix memory calculation by moving shared from `cached` to `used` dimension. ([#10183](https://github.com/netdata/netdata/pull/10183), [@mfundul](https://github.com/mfundul)) - Fix parsing in the `libreswan` collector. ([#10190](https://github.com/netdata/netdata/pull/10190), [@wash2](https://github.com/wash2)) - Fix an infinite loop in the statsd plugin ([#10180](https://github.com/netdata/netdata/pull/10180), [@vlvkobal](https://github.com/vlvkobal)) - Fix two bugs related to version handling in install and update code. ([#10162](https://github.com/netdata/netdata/pull/10162), [@Ferroin](https://github.com/Ferroin)) - Fix builds using particular versions of Clang. ([#10155](https://github.com/netdata/netdata/pull/10155), [@Ferroin](https://github.com/Ferroin)) - Disregard host tags configuration pointer. 
([#10121](https://github.com/netdata/netdata/pull/10121), [@mfundul](https://github.com/mfundul)) - Fix platform dependent printf format. ([#10120](https://github.com/netdata/netdata/pull/10120), [@Saruspete](https://github.com/Saruspete)) - Fix compile error in CentOS 6. ([#10110](https://github.com/netdata/netdata/pull/10110), [@stelfrag](https://github.com/stelfrag)) - Fix cross compilation by properly disabling eBPF detection. ([#10034](https://github.com/netdata/netdata/pull/10034), [@KickerTom](https://github.com/KickerTom)) - Fix cgroups collector resolving container names in k8s. ([#10072](https://github.com/netdata/netdata/pull/10072), [@ilyam8](https://github.com/ilyam8)) - Fix a compilation warning. ([#10320](https://github.com/netdata/netdata/pull/10320), [@vlvkobal](https://github.com/vlvkobal)) - Fix UUID_STR_LEN undefined on macOS. ([#10313](https://github.com/netdata/netdata/pull/10313), [@underhood](https://github.com/underhood)) - Fix `python.d plugin` runtime chart creation. ([#10296](https://github.com/netdata/netdata/pull/10296), [@ilyam8](https://github.com/ilyam8)) - Fix race condition in `rrdset_first_entry_t()` and `rrdset_last_entry_t()`. ([#10276](https://github.com/netdata/netdata/pull/10276), [@mfundul](https://github.com/mfundul)) - Fix the data endpoint so that the context param is correctly applied to children. ([#10290](https://github.com/netdata/netdata/pull/10290), [@stelfrag](https://github.com/stelfrag)) - Fix Coverity errors (CID 364045,364046). ([#10282](https://github.com/netdata/netdata/pull/10282), [@stelfrag](https://github.com/stelfrag)) - Fix the `elasticsearch_last_collected` alarm. ([#10226](https://github.com/netdata/netdata/pull/10226), [@ilyam8](https://github.com/ilyam8)) - Fix spelling error in `xenstat.plugin`. ([#10224](https://github.com/netdata/netdata/pull/10224), [@ilyam8](https://github.com/ilyam8)) - Fix chart filtering. 
([#10218](https://github.com/netdata/netdata/pull/10218), [@vlvkobal](https://github.com/vlvkobal)) - Fix Coverity issues. ([#10216](https://github.com/netdata/netdata/pull/10216), [@vlvkobal](https://github.com/vlvkobal)) - Fix libnetdata headers to be compatible with C++. ([#10185](https://github.com/netdata/netdata/pull/10185), [@KickerTom](https://github.com/KickerTom)) - Fix registry responses to remove caching. ([#10181](https://github.com/netdata/netdata/pull/10181), [@cakrit](https://github.com/cakrit)) - Fix eBPF memory management. ([#10096](https://github.com/netdata/netdata/pull/10096), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix overlapping memory issue. ([#10097](https://github.com/netdata/netdata/pull/10097), [@mfundul](https://github.com/mfundul)) - Fix response and upstream response time histogram charts in the `web_log` collector. ([go.d.plugin/#462](https://github.com/netdata/go.d.plugin/pull/462), [@hamedbrd](https://github.com/hamedbrd)) - Fix logs timestamps always in UTC issue in the `go.d.plugin` ([go.d.plugin#460](https://github.com/netdata/go.d.plugin/pull/460), [@ilyam8](https://github.com/ilyam8)) - Fix collecting slave status for MariaDB v10.2.0- in the `mysql` collector ([go.d.plugin#465](https://github.com/netdata/go.d.plugin/pull/465), [@ilyam8](https://github.com/ilyam8)) - Fix _cumulative_stats_ configuration option in the `unbound` collector ([go.d.plugin#478](https://github.com/netdata/go.d.plugin/pull/478), [@ilyam8](https://github.com/ilyam8)) - Fix parsing configuration file (respect 'include-toplevel' directive) in `unbound` collector ([go.d.plugin#480](https://github.com/netdata/go.d.plugin/pull/480), [@ilyam8](https://github.com/ilyam8)) - Fix handling charts with type.id >= 200 (netdata limit) in `go.d.plugin` ([go.d.plugin#472](https://github.com/netdata/go.d.plugin/pull/472), [@ilyam8](https://github.com/ilyam8)) - Fix parsing version query response in the `mysql` collector 
([go.d.plugin#498](https://github.com/netdata/go.d.plugin/pull/498), [@ilyam8](https://github.com/ilyam8)) - Fix `Netsplits` chart dimensions algorithm in the `vernemq` collector. ([go.d.plugin#511](https://github.com/netdata/go.d.plugin/pull/511), [@ilyam8](https://github.com/ilyam8)) - Fix a typo in `dashboard_info.js` for VerneMQ. ([#10223](https://github.com/netdata/netdata/pull/10223), [@ilyam8](https://github.com/ilyam8)) 2020-12-17T14:16:12+00:00 netdata v1.28.0 netdata v1.28.0 2020-12-18T15:18:12+00:00 Release v1.28.0 is a hotfix release to address a deadlock in the Netdata Agent. We intended to release this hotfix as `v1.27.1`, but we can't backtrack on a release once we've begun to publish new Docker images and binary packages on other platforms. If the Agent-Cloud link (ACLK) connection drops and the Agent fails to queue an `on_connect` message, it also fails to properly release a lock in the web server thread. ## Bug fix - Fix locking after `on_connect` failure. ([#10401](https://github.com/netdata/netdata/pull/10401), [@stelfrag](https://github.com/stelfrag)) 2020-12-18T15:18:12+00:00 netdata v1.29.0 netdata v1.29.0 2021-02-03T13:57:35+00:00 # Release v1.29.0 The v1.29.0 release of the Netdata Agent is a maintenance release that brings incremental but necessary improvements that make your monitoring experience more robust. We've pushed improvements and bug fixes to the installation and update scripts, enriched our library of collectors, and focused on fixing bugs reported by the community. This release contains 2 new collectors, the migration of 3 collectors from Python to Go, 25 other improvements, 25 documentation updates, and 26 bug fixes. 
## At a glance Netdata now collects and meaningfully organizes metrics from both the [Couchbase](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/couchbase) JSON document database and the [`nginx-module-vts` module](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginxvts) for exposing metrics about NGINX virtual hosts. Click either of the links to head straight into the documentation that explains what they collect and how to configure both based on whether they're collecting over `localhost` or across nodes. We've also migrated more collectors from Python to Go in our continued efforts to make data collection faster and more robust. The newest effort includes our [Redis](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis), [Pika](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/pika), and [Energi Core Wallet](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/energid) collectors. On the dashboard, we improved the responsiveness of panning forward and backward through historical metrics data by preventing unnecessary updates and reducing the number of calls. The charts should also now immediately update when you stop panning. ## Acknowledgments - [@slavox](https://github.com/slavox) for fixing temperature parsing in the `hddtemp` collector. - [@skibbipl](https://github.com/skibbipl) for issuing a fix for the updater. - [@jsoref](https://github.com/jsoref) for the huge number of spelling fixes. - [@nabijaczleweli](https://github.com/nabijaczleweli) for the fix in the `diskspace` plugin. - [@Steve8291](https://github.com/Steve8291) for the documentation fix on using the Bash shell for debugging. - [@kdvlr](https://github.com/kdvlr) for the added instructions on Telegram notifications. - [@grinapo](https://github.com/grinapo) for the improvement on the Python-based Fail2Ban collector. 
- [@dpsy4](https://github.com/dpsy4) for the support for per series styling for dygraphs - [@ski2per](https://github.com/ski2per) for adding `nginxvts` collector - [@hamedbrd](https://github.com/hamedbrd) for adding `couchbase` collector - [@g3offrey](https://github.com/g3offrey) for improving `prometheus` collector default configuration ## Improvements - Reduce the number of alarm updates on ACLK. ([#10524](https://github.com/netdata/netdata/pull/10524), [@stelfrag](https://github.com/stelfrag)) - Remove unused entries from structures. ([#10519](https://github.com/netdata/netdata/pull/10519), [@stelfrag](https://github.com/stelfrag)) - Improve the retry/backoff during claiming. ([#10482](https://github.com/netdata/netdata/pull/10482), [@underhood](https://github.com/underhood)) - Support multiple chart label keys in data queries. ([#10483](https://github.com/netdata/netdata/pull/10483), [@stelfrag](https://github.com/stelfrag)) - Truncate excessive information from titles for `apps` and `cgroups` collectors. ([#10479](https://github.com/netdata/netdata/pull/10479), [@vlvkobal](https://github.com/vlvkobal)) - Use mguid instead of hostname in the ACLK collector list. ([#10394](https://github.com/netdata/netdata/pull/10394), [@underhood](https://github.com/underhood)) - Cleanup and minor fixes to eBPF collector. ([#10434](https://github.com/netdata/netdata/pull/10434), [@thiagoftsm](https://github.com/thiagoftsm)) - Add `_is_k8s_node` label to the host labels. ([#10501](https://github.com/netdata/netdata/pull/10501), [@Ilyam8](https://github.com/ilyam8)) - Move ACLK into a legacy subfolder. ([#10265](https://github.com/netdata/netdata/pull/10265), [@underhood](https://github.com/underhood)) - Exclude autofs by default in the `diskspace` plugin. ([#10441](https://github.com/netdata/netdata/pull/10441), [@nabijaczleweli](https://github.com/nabijaczleweli)) - Mark internal functions as static in health code. 
([#10518](https://github.com/netdata/netdata/pull/10518), [@vkalintiris](https://github.com/vkalintiris)) - Remove unused struct in health code. ([#10517](https://github.com/netdata/netdata/pull/10517), [@vkalintiris](https://github.com/vkalintiris)) - Add support for per series styling for dygraphs. ([#8668](https://github.com/netdata/netdata/pull/8668), [@dpsy4](https://github.com/dpsy4)) ## Dashboard - Fix minor vulnerability alert by updating `socket-io` dependency. ([#10557](https://github.com/netdata/netdata/pull/10557), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix dygraph panning responsiveness, chart heights and performance improvements. ([#10520](https://github.com/netdata/netdata/pull/10520), [@jacekkolasa](https://github.com/jacekkolasa)) - Make legend position configurable. ([#10565](https://github.com/netdata/netdata/pull/10565), [@jacekkolasa](https://github.com/jacekkolasa)) ## Collectors - Add Go version of the `redis` collector. ([go.d.plugin#518](https://github.com/netdata/go.d.plugin/pull/518), [@Ilyam8](https://github.com/ilyam8)) - Add Go version of the `pika` collector. ([go.d.plugin#518](https://github.com/netdata/go.d.plugin/pull/518), [@Ilyam8](https://github.com/ilyam8)) - Add Go version of the `energid` collector. ([go.d.plugin#524](https://github.com/netdata/go.d.plugin/pull/524), [@thiagoftsm](https://github.com/thiagoftsm)) - Add a new `nginxvts` collector. ([go.d.plugin#523](https://github.com/netdata/go.d.plugin/pull/523), [@ski2per](https://github.com/ski2per)) - Add a new `couchbase` collector. ([go.d.plugin#530](https://github.com/netdata/go.d.plugin/pull/530), [@hamedbrd](https://github.com/hamedbrd)) - Add Traefik v2 to the `prometheus` collector default configuration. ([go.d.plugin#539](https://github.com/netdata/go.d.plugin/pull/539), [@g3offrey](https://github.com/g3offrey)) - Add an `expected_prefix` configuration option to the `prometheus` collector. 
([go.d.plugin#541](https://github.com/netdata/go.d.plugin/pull/541), [@Ilyam8](https://github.com/ilyam8)) - Add patterns support to the `filecheck` collector. ([go.d.plugin#538](https://github.com/netdata/go.d.plugin/pull/538), [@Ilyam8](https://github.com/ilyam8)) ## Packaging and installation - Properly handle arguments and responses for triggering Docker builds. ([#10545](https://github.com/netdata/netdata/pull/10545), [@Ferroin](https://github.com/Ferroin)) - Properly handle saved temporary directory on updates. ([#10550](https://github.com/netdata/netdata/pull/10550), [@Ferroin](https://github.com/Ferroin)) - Update go.d.plugin version to v0.27.0. ([#10544](https://github.com/netdata/netdata/pull/10544), [@Ilyam8](https://github.com/ilyam8)) - Update messages about checksum validation failures on install. ([#10448](https://github.com/netdata/netdata/pull/10448), [@Ferroin](https://github.com/Ferroin)) - Switch to using system libwebsockets for RPM builds. ([#10507](https://github.com/netdata/netdata/pull/10507), [@Ferroin](https://github.com/Ferroin)) - Persist `$TMPDIR` from installer to updater. ([#10384](https://github.com/netdata/netdata/pull/10384), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Make some tweaks/improvements to configure docs. ([#10528](https://github.com/netdata/netdata/pull/10528), [@joelhans](https://github.com/joelhans)) - Update Postgres collector doc to clarify how to install a required package. ([#10532](https://github.com/netdata/netdata/pull/10532), [@OdysLam](https://github.com/OdysLam)) - Add link to specific feedback megathread for the anomalies collector. ([#10506](https://github.com/netdata/netdata/pull/10506), [@andrewm4894](https://github.com/andrewm4894)) - Update the instructions on how to install Netdata on pfSense. 
([#10466](https://github.com/netdata/netdata/pull/10466), [@OdysLam](https://github.com/OdysLam)) - Fix documentation spelling mistakes ([#10508](https://github.com/netdata/netdata/pull/10508), [@jsoref](https://github.com/jsoref)) - Add guide: Monitor and visualize anomalies with Netdata. ([#10480](https://github.com/netdata/netdata/pull/10480), [@joelhans](https://github.com/joelhans)) - Add instructions on enabling explicitly disabled collectors ([#10418](https://github.com/netdata/netdata/pull/10418), [@joelhans](https://github.com/joelhans)) - Mention PostgreSQL Prometheus Adapter in the documentation ([#10487](https://github.com/netdata/netdata/pull/10487), [@vlvkobal](https://github.com/vlvkobal)) - Fix a typo in the python mysql documentation ([#10467](https://github.com/netdata/netdata/pull/10467), [@OdysLam](https://github.com/OdysLam)) - Fixes for SEO housekeeping/improvements ([#10468](https://github.com/netdata/netdata/pull/10468), [@joelhans](https://github.com/joelhans)) - Add guide: Detect anomalies in nodes and applications with Netdata ([#10451](https://github.com/netdata/netdata/pull/10451), [@joelhans](https://github.com/joelhans)) - Docs housekeeping for SEO and syntax, part 1 ([#10388](https://github.com/netdata/netdata/pull/10388), [@joelhans](https://github.com/joelhans)) - Small updates, improvements, and housekeeping to docs ([#10405](https://github.com/netdata/netdata/pull/10405), [@joelhans](https://github.com/joelhans)) - Change links at bottom of all install docs ([#10416](https://github.com/netdata/netdata/pull/10416), [@joelhans](https://github.com/joelhans)) - Add missing section to Netdata style guide. ([#10453](https://github.com/netdata/netdata/pull/10453), [@joelhans](https://github.com/joelhans)) - Update and improve the Netdata style guide. ([#10433](https://github.com/netdata/netdata/pull/10433), [@joelhans](https://github.com/joelhans)) - Improve configuration docs with common changes and start/stop/restart directions. 
([#10415](https://github.com/netdata/netdata/pull/10415), [@joelhans](https://github.com/joelhans)) - Add instructions on which file to edit for Telegram. ([#10398](https://github.com/netdata/netdata/pull/10398), [@kdvlr](https://github.com/kdvlr)) - Add centralized Cloud notifications to core docs. ([#10374](https://github.com/netdata/netdata/pull/10374), [@joelhans](https://github.com/joelhans)) - GitHub action markdown link check update. ([#10474](https://github.com/netdata/netdata/pull/10474), [@jsoref](https://github.com/jsoref)) - Change linting standard for Markdown lists. ([#10371](https://github.com/netdata/netdata/pull/10371), [@joelhans](https://github.com/joelhans)) - Update main README with release news. ([#10412](https://github.com/netdata/netdata/pull/10412), [@joelhans](https://github.com/joelhans)) - Improve the instructions on how to use the bash shell as user `netdata` for debugging. ([#10425](https://github.com/netdata/netdata/pull/10425), [@Steve8291](https://github.com/Steve8291)) ## Bug fixes - Fix Docker image tagging for nightly builds. ([#10584](https://github.com/netdata/netdata/pull/10584), [@Ferroin](https://github.com/Ferroin)) - Fix container detection from `systemd-detect-virt`. ([#10569](https://github.com/netdata/netdata/pull/10569), [@cakrit](https://github.com/cakrit)) - Fix Netdata Cloud support in RPM packages. ([#10578](https://github.com/netdata/netdata/pull/10578), [@Ferroin](https://github.com/Ferroin)) - Fix handling of TLS config so that cURL works in all cases. ([#10491](https://github.com/netdata/netdata/pull/10491), [@Ferroin](https://github.com/Ferroin)) - Fix function name in updater script. 
([#10462](https://github.com/netdata/netdata/pull/10462), [@Ferroin](https://github.com/Ferroin)) - Fix handling of environment file in updater script. ([#10447](https://github.com/netdata/netdata/pull/10447), [@Ferroin](https://github.com/Ferroin)) - Fix bundling of libwebsockets in binary packages. ([#10460](https://github.com/netdata/netdata/pull/10460), [@Ferroin](https://github.com/Ferroin)) - Fix for the updater to use Python3 if Python is not available. ([#10424](https://github.com/netdata/netdata/pull/10424), [@skibbipl](https://github.com/skibbipl)) - Fix disconnect message sent via ACLK on agent shutdown ([#10563](https://github.com/netdata/netdata/pull/10563), [@underhood](https://github.com/underhood)) - Fix prometheus remote write header ([#10560](https://github.com/netdata/netdata/pull/10560), [@vlvkobal](https://github.com/vlvkobal)) - Fix values in Prometheus export for metrics, collected by the Prometheus collector ([#10551](https://github.com/netdata/netdata/pull/10551), [@vlvkobal](https://github.com/vlvkobal)) - Fix handling spaces in labels values in the `prometheus` collector ([go.d.plugin#537](https://github.com/netdata/go.d.plugin/pull/537), [@Ilyam8](https://github.com/ilyam8)) - Fix `mysql.slave_status` alarm for go mysql collector ([#10513](https://github.com/netdata/netdata/pull/10513), [@Ilyam8](https://github.com/ilyam8)) - Make mdstat_mismatch_cnt alarm less strict ([#10488](https://github.com/netdata/netdata/pull/10488), [@Ilyam8](https://github.com/ilyam8)) - Dispatch cgroup discovery into another thread ([#10399](https://github.com/netdata/netdata/pull/10399), [@vlvkobal](https://github.com/vlvkobal)) - Fix data source option for Prometheus web API in exporting configuration ([#10397](https://github.com/netdata/netdata/pull/10397), [@vlvkobal](https://github.com/vlvkobal)) - Add Realtek network cards to the list of physical interfaces on FreeBSD ([#10414](https://github.com/netdata/netdata/pull/10414), 
[@vlvkobal](https://github.com/vlvkobal)) - Fix anomalies collector custom model bug ([#10459](https://github.com/netdata/netdata/pull/10459), [@andrewm4894](https://github.com/andrewm4894)) - Fix broken dbengine stress tests. ([#10502](https://github.com/netdata/netdata/pull/10502), [@mfundul](https://github.com/mfundul)) - Fix segmentation fault in the agent ([#10498](https://github.com/netdata/netdata/pull/10498), [@mfundul](https://github.com/mfundul)) - Fix memory allocation when computing standard deviation ([#10484](https://github.com/netdata/netdata/pull/10484), [@stelfrag](https://github.com/stelfrag)) - Fix temperature parsing in the hddtemp collector ([#10429](https://github.com/netdata/netdata/pull/10429), [@slavox](https://github.com/slavox)) - Fix postgres password bug and change default config ([#10531](https://github.com/netdata/netdata/pull/10531), [@OdysLam](https://github.com/OdysLam)) - Add handling "yes" and "no" and flexible space match in the python.d/fail2ban plugin ([#10400](https://github.com/netdata/netdata/pull/10400), [@grinapo](https://github.com/grinapo)) - Fix for older compilers ([#10470](https://github.com/netdata/netdata/pull/10470), [@underhood](https://github.com/underhood)) - Fix spelling mistakes in the Python plugin and documentation. ([#10525](https://github.com/netdata/netdata/pull/10525), [@jsoref](https://github.com/jsoref)) 2021-02-03T13:57:35+00:00 netdata v1.29.1 netdata v1.29.1 2021-02-09T13:05:35+00:00 Release v1.29.1 is a hotfix release to address a crash in the Netdata Agent. A locking bug in one of the internal collectors in Netdata could cause it to crash during shutdown in a way that would result in the Netdata Agent taking an excessively long time to exit. ## Bug Fixes - Fix crash during shutdown of cgroups internal plugin. 
([#10614](https://github.com/netdata/netdata/pull/10614), [@mfundul](https://github.com/mfundul)) 2021-02-09T13:05:35+00:00 netdata v1.29.2 netdata v1.29.2 2021-02-18T15:18:43+00:00 # Release v1.29.2 Release v1.29.2 is a patch release to improve the stability of the Netdata Agent. We discovered that an improvement introduced in v1.29.0 could inadvertently set all `os_*` host labels to `unknown`, which could affect users who leverage these host labels to organize their nodes, deploy health entities, or export metrics to external time-series databases. This bug was fixed in #10647. This release also contains additional bug fixes and improvements. ## Acknowledgments - [@tinyhammers](https://github.com/tinyhammers) for making the Opsgenie API URL configurable. - [@vjt](https://github.com/vjt) for documenting the `scheme` option in the Elasticsearch collector. - [@rda0](https://github.com/rda0) for the fix that prevents raw binary data from being printed. - [@fayak](https://github.com/fayak) for adding freeswitch to the `apps_groups.conf` file. ## Improvements - Make the Opsgenie `API URL` configurable. ([#10561](https://github.com/netdata/netdata/pull/10561), [@tinyhammers](https://github.com/tinyhammers)) - Add `k8s_cluster_id` host label. ([#10588](https://github.com/netdata/netdata/pull/10588), [@ilyam8](https://github.com/ilyam8)) - Enable `apps.plugin` aggregation debug messages. ([#10645](https://github.com/netdata/netdata/pull/10645), [@vlvkobal](https://github.com/vlvkobal)) - Add configuration parameter to disable stock alarms. ([#10617](https://github.com/netdata/netdata/pull/10617), [@thiagoftsm](https://github.com/thiagoftsm)) - Add ACLK proxy setting as host label. ([#10619](https://github.com/netdata/netdata/pull/10619), [@underhood](https://github.com/underhood)) - Add freeswitch to `apps_groups.conf`. 
([#10621](https://github.com/netdata/netdata/pull/10621), [@fayak](https://github.com/fayak)) - Simplify thread creation and remove unnecessary variables in the eBPF plugin. ([#10442](https://github.com/netdata/netdata/pull/10442), [@thiagoftsm](https://github.com/thiagoftsm)) ## Documentation - Fix a typo in `web/gui/readme.md`. ([#10623](https://github.com/netdata/netdata/pull/10623), [@OdysLam](https://github.com/OdysLam)) - Add resetting `CapabilityBoundingSet` workaround to the python.d collectors (that use `sudo`). ([#10587](https://github.com/netdata/netdata/pull/10587), [@ilyam8](https://github.com/ilyam8)) - Document the `scheme` option in the Elasticsearch collector. ([#10572](https://github.com/netdata/netdata/pull/10572), [@vjt](https://github.com/vjt)) - Update claiming instructions for Docker containers. ([#10570](https://github.com/netdata/netdata/pull/10570), [@Ferroin](https://github.com/Ferroin)) ## Bug fixes - Fix the context filtering on the data query endpoint. ([#10652](https://github.com/netdata/netdata/pull/10652), [@stelfrag](https://github.com/stelfrag)) - Fix container/host detection in the `system-info.sh` script. ([#10647](https://github.com/netdata/netdata/pull/10647), [@ilyam8](https://github.com/ilyam8)) - Add a small delay to the `ipv4_tcp_resets` alarms. ([#10644](https://github.com/netdata/netdata/pull/10644), [@ilyam8](https://github.com/ilyam8)) - Fix collecting operstate for virtual network interfaces. ([#10633](https://github.com/netdata/netdata/pull/10633), [@ilyam8](https://github.com/ilyam8)) - Fix sendmail unrecognized option F error. ([#10631](https://github.com/netdata/netdata/pull/10631), [@ilyam8](https://github.com/ilyam8)) - Fix so that raw binary data should never be printed. ([#10603](https://github.com/netdata/netdata/pull/10603), [@rda0](https://github.com/rda0)) - Change KSM memory chart type to stacked. 
([#10598](https://github.com/netdata/netdata/pull/10598), [@ilyam8](https://github.com/ilyam8)) - Allow the `REMOVED` alarm status via ACLK if the previous status was `WARN`/`CRIT`. ([#10533](https://github.com/netdata/netdata/pull/10533), [@stelfrag](https://github.com/stelfrag)) - Reduce excessive logging in the ACLK. ([#10596](https://github.com/netdata/netdata/pull/10596), [@underhood](https://github.com/underhood)) 2021-02-18T15:18:43+00:00 netdata v1.29.3 netdata v1.29.3 2021-02-23T13:54:01+00:00 # Release v1.29.3 Release v1.29.3 is a patch release to improve the stability of the Netdata Agent. We discovered a bug that occurs when `proc.plugin` attempts to collect the `operstate` parameter for a virtual network interface: if the chart is obsoleted, the Netdata Agent crashes. This bug was fixed in #10667. We're grateful to @gaia for first identifying this issue and working with our engineers, along with @sdellenb, to provide logs and point us toward the source of the bug. This release also contains additional bug fixes and improvements. ## Packaging/installation - Fixed condition controlling use of static LWS in RPM builds. ([#10661](https://github.com/netdata/netdata/pull/10661), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Improve the StatsD documentation and associated StatsD dashboard improvements. ([#10640](https://github.com/netdata/netdata/pull/10640), [@OdysLam](https://github.com/OdysLam)) - Fix broken links in docs and add collectors to list. ([#10651](https://github.com/netdata/netdata/pull/10651), [@joelhans](https://github.com/joelhans)) - Fix wrong link on docs Netdata Agent daemon. ([#10659](https://github.com/netdata/netdata/pull/10659), [@OdysLam](https://github.com/OdysLam)) ## Bug fixes - Fix `proc.plugin` to invalidate `RRDSETVAR` pointers on obsoletion. 
([#10667](https://github.com/netdata/netdata/pull/10667), [@mfundul](https://github.com/mfundul)) 2021-02-23T13:54:01+00:00 netdata v1.30.0 netdata v1.30.0 2021-03-31T12:30:55+00:00 The v1.30.0 release of Netdata brings major improvements to our packaging and completely replaces Google Analytics/GTM for product telemetry. We're also releasing the first changes in an upcoming overhaul to both our dashboard UI/UX and the suite of preconfigured alarms that comes with every installation. v1.30.0 contains 3 new collectors, 3 enhancements to notification methods, 38 improvements (13 in the dashboard), 16 documentation updates, and 17 bug fixes. ## At a glance The **ACLK-NG** is a much faster method of securely connecting a node to Netdata Cloud. In addition, it has no external dependencies on our custom [libmosquitto](https://github.com/netdata/mosquitto) and [libwebsockets](https://github.com/warmcat/libwebsockets) libraries, which means there's no more need to build these during installation. To enable ACLK-NG on a node that's already running the Netdata Agent, reinstall with the `--aclk-ng` option: ```bash bash <(curl -Ss https://my-netdata.io/kickstart.sh) --aclk-ng --reinstall ``` We **replaced Google Analytics/GTM**, which we used for collecting product telemetry, with a self-hosted instance of the open-source [PostHog](https://posthog.com/) project. When sending statistics to PostHog, any fields that might contain identifiable information, such as an IP address or URL, are hardcoded. If you previously opted out of anonymous statistics, this migration does not change your existing settings. We also published a **developer environment** (devenv) to simplify contributing to the Netdata Agent. The devenv packages everything you need to develop improvements on the Netdata Agent itself, or its collectors, in a single Docker image. Read more about this devenv, and get started, in the [Netdata community repo](https://github.com/netdata/community/tree/main/devenv). 
## Acknowledgments - [@aazedo](https://github.com/aazedo) for adding collection of attribute 233 (Media Wearout Indicator (SSD)) to the smartd_log collector - [@ossimantylahti](https://github.com/ossimantylahti) for fixing a typo in the email notifications readme - [@KickerTom](https://github.com/KickerTom) for renaming abs to ABS to avoid clash with standard definitions - [@Steve8291](https://github.com/Steve8291) for improving email, cron and ups groups in the apps_groups.conf - [@liepumartins](https://github.com/liepumartins) for adding wireguard to the vpn group in the apps_groups.conf - [@eltociear](https://github.com/eltociear) for fixing typos in main.h, backend_prometheus.c and dashboard_info.js - [@Habetdin](https://github.com/Habetdin) for fixing broken external links in the WEB GUI - [@salazarp](https://github.com/salazarp) for updating the syntax for Caddy v2 - [@RaitoBezarius](https://github.com/RaitoBezarius) for adding support to change IRC_PORT ## Improvements - Support VS Code container devenv. ([#10723](https://github.com/netdata/netdata/pull/10723), [@OdysLam](https://github.com/OdysLam)) - Add check for children connecting to a parent agent with an unsupported memory mode. ([#10787](https://github.com/netdata/netdata/pull/10787), [@stelfrag](https://github.com/stelfrag)) - Add lock check to avoid shutdown when compiled with internal and locking checks. ([#10835](https://github.com/netdata/netdata/pull/10835), [@stelfrag](https://github.com/stelfrag)) - Update chart's metadata in database when it already exists during creation. ([#10728](https://github.com/netdata/netdata/pull/10728), [@stelfrag](https://github.com/stelfrag)) - ACLK separate HTTPS client. ([#10784](https://github.com/netdata/netdata/pull/10784), [@underhood](https://github.com/underhood)) - Add new ACLK implementation (`ACLK-NG`). ([#10315](https://github.com/netdata/netdata/pull/10315), [@underhood](https://github.com/underhood)) - Add CPU statistics per ACLK query thread. 
([#10634](https://github.com/netdata/netdata/pull/10634), [@MrZammler](https://github.com/MrZammler)) - Add `_aclk_impl` label to the `/api/v1/info` endpoint. ([#10778](https://github.com/netdata/netdata/pull/10778), [@underhood](https://github.com/underhood)) - Add a new `chart` parameter to the `/api/v1/alarm_log` endpoint. ([#10788](https://github.com/netdata/netdata/pull/10788), [@MrZammler](https://github.com/MrZammler)) - Add data query support for archived charts. ([#10771](https://github.com/netdata/netdata/pull/10771), [@stelfrag](https://github.com/stelfrag)) - Add HTTP cookie (SameSite, Secure). ([#10676](https://github.com/netdata/netdata/pull/10676), [@thiagoftsm](https://github.com/thiagoftsm)) - Add statistics per Cloud query type. ([#10602](https://github.com/netdata/netdata/pull/10602), [@underhood](https://github.com/underhood)) - Add support for changing the number of pages per database engine extent. ([#10593](https://github.com/netdata/netdata/pull/10593), [@mfundul](https://github.com/mfundul)) - Add the ability to store chart labels in the database. ([#10718](https://github.com/netdata/netdata/pull/10718), [@stelfrag](https://github.com/stelfrag)) - Enable metadata persistence in all memory modes. ([#10742](https://github.com/netdata/netdata/pull/10742), [@stelfrag](https://github.com/stelfrag)) - Increase `curl connect-timeout` and decrease number of claim attempts. ([#10800](https://github.com/netdata/netdata/pull/10800), [@ilyam8](https://github.com/ilyam8)) - Increase the ACLK exponential backoff randomness. ([#10373](https://github.com/netdata/netdata/pull/10373), [@underhood](https://github.com/underhood)) - Log ACLK Cloud commands to `access.log`. ([#10697](https://github.com/netdata/netdata/pull/10697), [@stelfrag](https://github.com/stelfrag)) - Remove an unused function warning in legacy version of the ACLK. 
([#10731](https://github.com/netdata/netdata/pull/10731), [@underhood](https://github.com/underhood)) - Remove unreachable #else directives in plugins. ([#10523](https://github.com/netdata/netdata/pull/10523), [@vkalintiris](https://github.com/vkalintiris)) - Rename `struct avl` to `avl_element` and the `typedef` to `avl_t`. ([#10735](https://github.com/netdata/netdata/pull/10735), [@vkalintiris](https://github.com/vkalintiris)) - Replace Google Analytics with PostHog for backend telemetry events. ([#10636](https://github.com/netdata/netdata/pull/10636), [@andrewm4894](https://github.com/andrewm4894)) - Skip C++ incompatible header in main libnetdata header. ([#10737](https://github.com/netdata/netdata/pull/10737), [@vkalintiris](https://github.com/vkalintiris)) - Try to keep all pages from extents read from disk in the cache. ([#10558](https://github.com/netdata/netdata/pull/10558), [@mfundul](https://github.com/mfundul)) - Use a parameter name that is not a reserved keyword in C++. ([#10738](https://github.com/netdata/netdata/pull/10738), [@vkalintiris](https://github.com/vkalintiris)) - Use of out-of-line struct definitions. ([#10739](https://github.com/netdata/netdata/pull/10739), [@vkalintiris](https://github.com/vkalintiris)) ## Dashboard - Add `max` value to the `nvidia_smi.fan_speed` gauge. ([#10780](https://github.com/netdata/netdata/pull/10780), [@ilyam8](https://github.com/ilyam8)) - Add state map to duplex and operstate charts. ([#10752](https://github.com/netdata/netdata/pull/10752), [@vlvkobal](https://github.com/vlvkobal)) - Add supervisord to `dashboard_info.js`. ([#10754](https://github.com/netdata/netdata/pull/10754), [@ilyam8](https://github.com/ilyam8)) - Fix broken external links. ([#10586](https://github.com/netdata/netdata/pull/10586), [@Habetdin](https://github.com/Habetdin)) - Make network state map syntax consistent in `dashboard_info.js`. 
([#10849](https://github.com/netdata/netdata/pull/10849), [@ilyam8](https://github.com/ilyam8)) - dashboard@v2.13.28 ([#10761](https://github.com/netdata/netdata/pull/10761), [@jacekkolasa](https://github.com/jacekkolasa)) - Fix alarms log export. - Persist relative timeframe. - Allow multirow names in the replicated nodes list. - Fix the date & time picker overlap. - Update Font Awesome. - Truncate long names. - Update links: change `docs.netdata.cloud` to `learn.netdata.cloud`. - Remove Google's GA & GTM completely, in favor of open-source PostHog. ## Health ### Bug fixes - Fix delaying CLEAR notifications when using the `repeat` feature. ([#10846](https://github.com/netdata/netdata/pull/10846), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix wrong count of entries in the `alarm.log`. ([#10564](https://github.com/netdata/netdata/pull/10564), [@thiagoftsm](https://github.com/thiagoftsm)) ### Alarms - Add `wmi_` prefix to the wmi collector network alarms. ([#10782](https://github.com/netdata/netdata/pull/10782), [@ilyam8](https://github.com/ilyam8)) - Add collector prefix to the external collectors alarms. ([#10830](https://github.com/netdata/netdata/pull/10830), [@ilyam8](https://github.com/ilyam8)) - Apply adapter_raid alarms for every logical/physical device. ([#10820](https://github.com/netdata/netdata/pull/10820), [@ilyam8](https://github.com/ilyam8)) - Apply megacli alarms for every adapter/physical disk. ([#10834](https://github.com/netdata/netdata/pull/10834), [@ilyam8](https://github.com/ilyam8)) - Exclude cgroups network interfaces from packets dropped alarms. ([#10806](https://github.com/netdata/netdata/pull/10806), [@ilyam8](https://github.com/ilyam8)) - Fix various alarms critical and warning thresholds hysteresis. ([#10779](https://github.com/netdata/netdata/pull/10779), [@ilyam8](https://github.com/ilyam8)) - Improve alarms `info` fields. 
([#10853](https://github.com/netdata/netdata/pull/10853), [@ilyam8](https://github.com/ilyam8)) - Make VerneMQ alarms less sensitive. ([#10770](https://github.com/netdata/netdata/pull/10770), [@ilyam8](https://github.com/ilyam8)) - Make alarms less sensitive. ([#10688](https://github.com/netdata/netdata/pull/10688), [@ilyam8](https://github.com/ilyam8)) - Remove `exporting_metrics_lost` template. ([#10829](https://github.com/netdata/netdata/pull/10829), [@ilyam8](https://github.com/ilyam8)) - Remove `ram_in_swap` alarm. ([#10789](https://github.com/netdata/netdata/pull/10789), [@ilyam8](https://github.com/ilyam8)) - Use separate `packets_dropped_ratio` alarms for wireless network interfaces. ([#10785](https://github.com/netdata/netdata/pull/10785), [@ilyam8](https://github.com/ilyam8)) ### Notifications - Add ability to change port number when using IRC notification method. ([#10824](https://github.com/netdata/netdata/pull/10824), [@RaitoBezarius](https://github.com/RaitoBezarius)) - Add `dump_methods` parameter to `alarm-notify.sh.in`. ([#10772](https://github.com/netdata/netdata/pull/10772), [@MrZammler](https://github.com/MrZammler)) - Log an error if there is a failure during an email alarm notification. ([#10818](https://github.com/netdata/netdata/pull/10818), [@ilyam8](https://github.com/ilyam8)) ## Collectors ### New - Add monitoring of synchronization system calls to the eBPF collector. ([#10814](https://github.com/netdata/netdata/pull/10814), [@thiagoftsm](https://github.com/thiagoftsm)) - Add monitoring of Linux page cache to the eBPF collector. ([#10693](https://github.com/netdata/netdata/pull/10693), [@thiagoftsm](https://github.com/thiagoftsm)) ### Improvements - Add `k6.conf` to the StatsD collector. ([#10733](https://github.com/netdata/netdata/pull/10733), [@OdysLam](https://github.com/OdysLam)) - Clean up the eBPF collector. 
([#10680](https://github.com/netdata/netdata/pull/10680), [@thiagoftsm](https://github.com/thiagoftsm)) - Use working set for memory utilization in the cgroups collector. ([#10712](https://github.com/netdata/netdata/pull/10712), [@vlvkobal](https://github.com/vlvkobal)) - Add new configuration parameters to the example Python collector. ([#10777](https://github.com/netdata/netdata/pull/10777), [@andrewm4894](https://github.com/andrewm4894)) - Add carrier and MTU charts for network interfaces. ([#10866](https://github.com/netdata/netdata/pull/10866), [@vlvkobal](https://github.com/vlvkobal)) - Improve email, cron, and UPS groups in the `apps.plugin` configuration. ([#9313](https://github.com/netdata/netdata/pull/9313), [@Steve8291](https://github.com/Steve8291)) - Add Wireguard to the `vpn` group in the `apps.plugin` configuration. ([#10743](https://github.com/netdata/netdata/pull/10743), [@liepumartins](https://github.com/liepumartins)) - Add alarm values collection to the Python alarms collector. ([#10675](https://github.com/netdata/netdata/pull/10675), [@andrewm4894](https://github.com/andrewm4894)) - Add `attribute 233` (Media Wearout Indicator (SSD)) collection to the python smartd_log collector. ([#10711](https://github.com/netdata/netdata/pull/10711), [@aazedo](https://github.com/aazedo)) - Move network interface speed, duplex, and operstate variables to charts. ([#10740](https://github.com/netdata/netdata/pull/10740), [@vlvkobal](https://github.com/vlvkobal)) - Update `go.d.plugin` version to v0.28.1. ([#10826](https://github.com/netdata/netdata/pull/10826), [@ilyam8](https://github.com/ilyam8)) - Add a `noauthcodecheck` workaround flag to the freeipmi collector. ([#10701](https://github.com/netdata/netdata/pull/10701), [@vlvkobal](https://github.com/vlvkobal)) ### Bug fixes - Fix eBPF collector compatibility with kernels v5.11+. 
([#10707](https://github.com/netdata/netdata/pull/10707), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix disks identification in the diskstats collector. ([#10843](https://github.com/netdata/netdata/pull/10843), [@vlvkobal](https://github.com/vlvkobal)) - Fix the count of `cpuset.cpus` in the cgroups collector. ([#10757](https://github.com/netdata/netdata/pull/10757), [@ilyam8](https://github.com/ilyam8)) - Fix disk utilization and backlog charts in the diskstats collector. ([#10705](https://github.com/netdata/netdata/pull/10705), [@vlvkobal](https://github.com/vlvkobal)) ## Exporting ### Bug fixes - Fix adding duplicate `_total` suffixes for the Prometheus collector. ([#10674](https://github.com/netdata/netdata/pull/10674), [@vlvkobal](https://github.com/vlvkobal)) ## Packaging and installation - Add JSON output option for `buildinfo`. ([#10706](https://github.com/netdata/netdata/pull/10706), [@Ferroin](https://github.com/Ferroin)) - Add information about the `--aclk-ng` option to the netdata-installer script. ([#10852](https://github.com/netdata/netdata/pull/10852), [@underhood](https://github.com/underhood)) - Add support for claiming nodes as part of installation. 
([#10084](https://github.com/netdata/netdata/pull/10084), [@Ferroin](https://github.com/Ferroin)) - Assorted updater fixes ([#10613](https://github.com/netdata/netdata/pull/10613), [@Ferroin](https://github.com/Ferroin)) - Fix claiming via environment variables in a Docker container ([#10811](https://github.com/netdata/netdata/pull/10811), [@ilyam8](https://github.com/ilyam8)) - Fix detection of already claimed node in Docker images ([#10720](https://github.com/netdata/netdata/pull/10720), [@Ferroin](https://github.com/Ferroin)) - Fix handling of perf.plugin capabilities ([#10766](https://github.com/netdata/netdata/pull/10766), [@Ferroin](https://github.com/Ferroin)) - Fix handling of permissions for some plugins ([#10490](https://github.com/netdata/netdata/pull/10490), [@Ferroin](https://github.com/Ferroin)) ## Documentation - Add guide: _Develop a custom data collector for Netdata in Python_. ([#10710](https://github.com/netdata/netdata/pull/10710), [@joelhans](https://github.com/joelhans)) - Add guide: _LAMP stack monitoring_. ([#10698](https://github.com/netdata/netdata/pull/10698), [@joelhans](https://github.com/joelhans)) - Add guide: _Unsupervised anomaly detection for Raspberry Pi monitoring_. ([#10713](https://github.com/netdata/netdata/pull/10713), [@joelhans](https://github.com/joelhans)) - Add guide: _How to use any StatsD data source with Netdata_. ([#10719](https://github.com/netdata/netdata/pull/10719), [@OdysLam](https://github.com/OdysLam)) - Convert references to `service` to `systemctl`. ([#10703](https://github.com/netdata/netdata/pull/10703), [@joelhans](https://github.com/joelhans)) - Fix broken link in StatsD guide. ([#10831](https://github.com/netdata/netdata/pull/10831), [@joelhans](https://github.com/joelhans)) - Fix broken links in active alarms doc. ([#10678](https://github.com/netdata/netdata/pull/10678), [@joelhans](https://github.com/joelhans)) - Improve the Kubernetes deployment documentation. 
([#10662](https://github.com/netdata/netdata/pull/10662), [@joelhans](https://github.com/joelhans)) - Revamp StatsD docs. ([#10637](https://github.com/netdata/netdata/pull/10637), [@OdysLam](https://github.com/OdysLam)) - Update guide: _Kubernetes monitoring with Netdata: Overview and visualizations_. ([#10691](https://github.com/netdata/netdata/pull/10691), [@joelhans](https://github.com/joelhans)) - Update screenshots and text for new Cloud navigation. ([#10664](https://github.com/netdata/netdata/pull/10664), [@joelhans](https://github.com/joelhans)) - Comment out `memory mode` mention in StatsD example. ([#10751](https://github.com/netdata/netdata/pull/10751), [@OdysLam](https://github.com/OdysLam)) - Fix a typo in the email notifications doc. ([#10668](https://github.com/netdata/netdata/pull/10668), [@ossimantylahti](https://github.com/ossimantylahti)) - Update syntax for Caddy v2. ([#10823](https://github.com/netdata/netdata/pull/10823), [@salazarp](https://github.com/salazarp)) ## Bug fixes - Fix a typo in `main.h`. ([#10858](https://github.com/netdata/netdata/pull/10858), [@eltociear](https://github.com/eltociear)) - Fix a typo in `backend_prometheus.c`. ([#10716](https://github.com/netdata/netdata/pull/10716), [@eltociear](https://github.com/eltociear)) - Fix a typo in `dashboard_info.js`. ([#10775](https://github.com/netdata/netdata/pull/10775), [@eltociear](https://github.com/eltociear)) - Fix segfault due to misalignment between global and StatsD memory modes. ([#10732](https://github.com/netdata/netdata/pull/10732), [@stelfrag](https://github.com/stelfrag)) - Fix zombie alarms for charts that are obsolete/removed. ([#10804](https://github.com/netdata/netdata/pull/10804), [@vlvkobal](https://github.com/vlvkobal)) - Fix a Coverity warning in the new MQTT library. ([#10851](https://github.com/netdata/netdata/pull/10851), [@underhood](https://github.com/underhood)) - Fix a parameter binding issue when storing chart names in the database. 
([#10717](https://github.com/netdata/netdata/pull/10717), [@stelfrag](https://github.com/stelfrag)) - Fix crash when executing data query with context and non-existing `chart_label_key`. ([#10844](https://github.com/netdata/netdata/pull/10844), [@stelfrag](https://github.com/stelfrag)) - Fix claiming behind Squid proxy. ([#10734](https://github.com/netdata/netdata/pull/10734), [@underhood](https://github.com/underhood)) - Fix Coverity issue (CID 367566). ([#10813](https://github.com/netdata/netdata/pull/10813), [@stelfrag](https://github.com/stelfrag)) - Fix memory leak when archived data is requested. ([#10837](https://github.com/netdata/netdata/pull/10837), [@stelfrag](https://github.com/stelfrag)) - Fix clash with C++ standard definitions by changing `abs` to `ABS`. ([#10354](https://github.com/netdata/netdata/pull/10354), [@KickerTom](https://github.com/KickerTom)) 2021-03-31T12:30:55+00:00 netdata v1.30.1 netdata v1.30.1 2021-04-12T13:19:51+00:00 This is a patch release to address discovered issues since 1.30.0. ## Acknowledgments - [@jsoref](https://github.com/jsoref) for fixing numerous spelling mistakes. ## Documentation - Fix grammar in ACLK README.md. ([#10898](https://github.com/netdata/netdata/pull/10898), [@slimanio](https://github.com/slimanio)) - Update news and GIF in README, fix typo. ([#10900](https://github.com/netdata/netdata/pull/10900), [@joelhans](https://github.com/joelhans)) - Fix spelling mistakes in various places. ([#10428](https://github.com/netdata/netdata/pull/10428), [@jsoref](https://github.com/jsoref)) ## Packaging / Installation - Don’t use glob expansion in argument to `cd` in updater. ([#10936](https://github.com/netdata/netdata/pull/10936), [@Ferroin](https://github.com/Ferroin)) - Bumped version of OpenSSL bundled in static builds to 1.1.1k. ([#10884](https://github.com/netdata/netdata/pull/10884), [@Ferroin](https://github.com/Ferroin)) - Fix bundling of ACLK-NG components in dist tarballs. 
([#10894](https://github.com/netdata/netdata/pull/10894), [@Ferroin](https://github.com/Ferroin)) ## Bug Fixes - Fix memory corruption issue when executing context queries in RAM/SAVE memory mode. ([#10933](https://github.com/netdata/netdata/pull/10933), [@stelfrag](https://github.com/stelfrag)) - Add a CRASH event when the agent fails to shut down properly. ([#10893](https://github.com/netdata/netdata/pull/10893), [@stelfrag](https://github.com/stelfrag)) - Fix incorrect health log entries. ([#10822](https://github.com/netdata/netdata/pull/10822), [@stelfrag](https://github.com/stelfrag)) 2021-04-12T13:19:51+00:00 netdata v1.31.0 netdata v1.31.0 2021-05-19T12:21:44+00:00 The v1.31.0 release of Netdata comes with re-packaged and redesigned elements of the dashboard to help you focus on your metrics, even more Linux kernel insights via eBPF, on-node machine learning to help you find anomalies, and much more. This release contains 10 new collectors, 54 improvements (7 in the dashboard), 31 documentation updates, and 29 bug fixes. ## At a glance We **re-packaged and redesigned portions of the dashboard** to improve the overall experience. Part of this effort is better handling of dashboard code during installation; anyone using third-party packages (such as the Netdata Homebrew formula) will start seeing new features and the new designs starting today. The [timeframe picker](https://learn.netdata.cloud/docs/dashboard/select-timeframes) has moved to the top panel, and just to its right are two counters with live `CRITICAL` and `WARNING` alarm statuses for your node. Click on either of these to open the [alarms modal](https://learn.netdata.cloud/docs/monitor/view-active-alarms#local-netdata-agent-dashboard). We've also pushed a number of powerful new collectors, including **directory cache monitoring via [eBPF](https://learn.netdata.cloud/docs/agent/collectors/ebpf.plugin/)**. 
By monitoring the directory cache, developers and SREs alike can find opportunities to optimize memory usage and reduce disk-intensive operations. Our new **[Z-scores](https://learn.netdata.cloud/docs/agent/collectors/python.d.plugin/zscores) and [changefinder](https://learn.netdata.cloud/docs/agent/collectors/python.d.plugin/changefinder) collectors** use machine learning to let you know, at a glance, when key metrics start to behave oddly. We'd love to get feedback on this sophisticated, subjective new brand of collectors! [Netdata Learn](https://learn.netdata.cloud/), our documentation and educational site, got some **refreshed visuals and an improved navigation tree** to help you find the right doc quickly. Hit `Ctrl/⌘ + k` to start a new search! ### Update now If you're not receiving automatic updates on your node(s), check our [update](https://learn.netdata.cloud/docs/agent/packaging/installer/update#determine-which-installation-method-you-used) doc for details. ## Acknowledgments - [@jsoref](https://github.com/jsoref) for fixing numerous spelling mistakes. - [@Steve8291](https://github.com/Steve8291) for improving plugins error logging on restart and documentation improvement. - [@vincentkersten](https://github.com/vincentkersten) for updating the nvidia-smi collector documentation. - [@Avre](https://github.com/Avre) for updating the install on cloud providers doc. - [@endreszabo](https://github.com/endreszabo) for adding renaming libvirtd LXC containers support. - [@RaitoBezarius](https://github.com/RaitoBezarius) for adding attribute 249 support to the `smartd_log` module. - [@Habetdin](https://github.com/Habetdin) for updating the `fping` version. - [@wangpei-nice](https://github.com/wangpei-nice) for fixing `.deb` and `.rpm` packaging of the eBPF plugin. - [@tiramiseb](https://github.com/tiramiseb) for improving the installation method for Alpine. - [@BastienBalaud](https://github.com/BastienBalaud) for upgrading the OKay repository for RHEL8. 
- [@tknobi](https://github.com/tknobi) for adding the Nextcloud plugin to the third-party collector list. - [@jilleJr](https://github.com/jilleJr) for adding an IPv6 listen address example to the Nginx proxy doc. - [@cherouvim](https://github.com/cherouvim) for fixing formatting and wording in the Apache proxy doc. - [@yavin87](https://github.com/yavin87) for fixing spelling in the infrastructure monitoring quickstart. - [@tnyeanderson](https://github.com/tnyeanderson) for improving `dash-example.html`. - [@tomcbe](https://github.com/tomcbe) for fixing Microsoft Teams notification method naming. - [@tnyeanderson](https://github.com/tnyeanderson) for improving the `dash-example` documentation. - [@diizzyy](https://github.com/diizzyy) for fixing a bug in the FreeBSD plugin. ## Improvements - Automatically trigger Helmchart PR on Agent release. ([#11084](https://github.com/netdata/netdata/pull/11084), [@Ferroin](https://github.com/Ferroin)) - Implement ACLK env endpoint. ([#10833](https://github.com/netdata/netdata/pull/10833), [@underhood](https://github.com/underhood)) - Implement new HTTPS client for ACLK. ([#10805](https://github.com/netdata/netdata/pull/10805), [@underhood](https://github.com/underhood)) - Update ACLK passwd endpoint to match specifications of the new architecture. ([#10859](https://github.com/netdata/netdata/pull/10859), [@underhood](https://github.com/underhood)) - Implement ACLK new backoff (TBEB) architecture. ([#10941](https://github.com/netdata/netdata/pull/10941), [@underhood](https://github.com/underhood)) - Add functionality to store `node_id` for a host. ([#11059](https://github.com/netdata/netdata/pull/11059), [@stelfrag](https://github.com/stelfrag)) - Remove version negotiation from ACLK-NG. ([#10980](https://github.com/netdata/netdata/pull/10980), [@underhood](https://github.com/underhood)) - Persist claim IDs in local database for parent and children. 
([#10993](https://github.com/netdata/netdata/pull/10993), [@stelfrag](https://github.com/stelfrag)) - Provide more agent analytics to PostHog. ([#11020](https://github.com/netdata/netdata/pull/11020), [@MrZammler](https://github.com/MrZammler)) - Reduce logging when sending agent analytics. ([#11091](https://github.com/netdata/netdata/pull/11091), [@MrZammler](https://github.com/MrZammler)) - Remove error message on Netdata restart. ([#8685](https://github.com/netdata/netdata/pull/8685), [@Steve8291](https://github.com/Steve8291)) - Add a timeout when sending anonymous statistics using `curl`. ([#11010](https://github.com/netdata/netdata/pull/11010), [@ilyam8](https://github.com/ilyam8)) - Improve `dash-example.html`. ([#10870](https://github.com/netdata/netdata/pull/10870), [@tnyeanderson](https://github.com/tnyeanderson)) - Add `host_cloud_enabled` attribute to analytics. ([#11100](https://github.com/netdata/netdata/pull/11100), [@MrZammler](https://github.com/MrZammler)) ## Dashboard - Bundle the react dashboard code into the agent repo directly. ([#11139](https://github.com/netdata/netdata/pull/11139), [@Ferroin](https://github.com/Ferroin)) - Add dashboard info strings for `systemdunits` collector. ([#10904](https://github.com/netdata/netdata/pull/10904), [@ilyam8](https://github.com/ilyam8)) - Update dashboard version to v2.17.0. ([#10856](https://github.com/netdata/netdata/pull/10856), [@allelos](https://github.com/allelos)) - Top bar, side panel, and overall navigation have been redesigned. - Top bar now includes a light bulb icon with news/features and the number of `CRITICAL` or `WARNING` alarms. - Documentation and settings buttons moved to the sidebar. - Improved rendering of sign in/sign up option button along with an operational status option (under user settings). - In the left panel, nodes show a status badge and are now searchable if there are more than 4. ## Health ### Improvements - Add `charts` configuration option to templates. 
([#11054](https://github.com/netdata/netdata/pull/11054), [@thiagoftsm](https://github.com/thiagoftsm)) - Add new attributes to health configuration files. ([#10961](https://github.com/netdata/netdata/pull/10961), [@MrZammler](https://github.com/MrZammler)) - Add `inconsistent` state to the `mysql_galera_cluster_state` alarm. ([#10945](https://github.com/netdata/netdata/pull/10945), [@ilyam8](https://github.com/ilyam8)) - Add systemdunits collector alarms. ([#10906](https://github.com/netdata/netdata/pull/10906), [@ilyam8](https://github.com/ilyam8)) - Use `average` instead of `sum` in VerneMQ alarms. ([#11037](https://github.com/netdata/netdata/pull/11037), [@ilyam8](https://github.com/ilyam8)) - Check configuration for `CUSTOM` and `MSTEAM`. ([#11113](https://github.com/netdata/netdata/pull/11113), [@MrZammler](https://github.com/MrZammler)) - Reduce alarms notifications dump logging. ([#11116](https://github.com/netdata/netdata/pull/11116), [@ilyam8](https://github.com/ilyam8)) ### Bug fixes - Add `synchronization.conf` to the Makefile. ([#10907](https://github.com/netdata/netdata/pull/10907), [@ilyam8](https://github.com/ilyam8)) - Fix Microsoft Teams naming. ([#9905](https://github.com/netdata/netdata/pull/9905), [@tomcbe](https://github.com/tomcbe)) ## Collectors ### New - Add a chart for out of memory kills. ([#10880](https://github.com/netdata/netdata/pull/10880), [@vlvkobal](https://github.com/vlvkobal)) - Add a chart with Netdata uptime. ([#10997](https://github.com/netdata/netdata/pull/10997), [@vlvkobal](https://github.com/vlvkobal)) - Add a module for ZFS pool state. ([#11071](https://github.com/netdata/netdata/pull/11071), [@vlvkobal](https://github.com/vlvkobal)) - Add a plugin for the system clock synchronization state. ([#10895](https://github.com/netdata/netdata/pull/10895), [@vlvkobal](https://github.com/vlvkobal)) - Add new charts for extended disk metrics. 
([#10939](https://github.com/netdata/netdata/pull/10939), [@vlvkobal](https://github.com/vlvkobal)) - Add support for renaming libvirtd LXC containers. ([#11006](https://github.com/netdata/netdata/pull/11006), [@endreszabo](https://github.com/endreszabo)) - Add a metric for Percpu memory. ([#10964](https://github.com/netdata/netdata/pull/10964), [@vlvkobal](https://github.com/vlvkobal)) - Add an eBPF directory cache collector. ([#10855](https://github.com/netdata/netdata/pull/10855), [@thiagoftsm](https://github.com/thiagoftsm)) - Add a Z-scores python collector. ([#10673](https://github.com/netdata/netdata/pull/10673), [@andrewm4894](https://github.com/andrewm4894)) - Add changefinder python collector. ([#10672](https://github.com/netdata/netdata/pull/10672), [@andrewm4894](https://github.com/andrewm4894)) ### Improvements - Remove dots in cgroup IDs. ([#11050](https://github.com/netdata/netdata/pull/11050), [@vlvkobal](https://github.com/vlvkobal)) - Add support for attribute 249 (NAND Writes 1GiB) to the `smartd_log` module. ([#10872](https://github.com/netdata/netdata/pull/10872), [@RaitoBezarius](https://github.com/RaitoBezarius)) - Add RAID level to the `mdstat` collector chart families. ([#11024](https://github.com/netdata/netdata/pull/11024), [@ilyam8](https://github.com/ilyam8)) - Update `fping` version. ([#10977](https://github.com/netdata/netdata/pull/10977), [@Habetdin](https://github.com/Habetdin)) - Add plugin and module names to the `python.d.plugin` runtime charts. ([#11007](https://github.com/netdata/netdata/pull/11007), [@ilyam8](https://github.com/ilyam8)) - Move global stats to a separate thread. ([#10991](https://github.com/netdata/netdata/pull/10991), [@vlvkobal](https://github.com/vlvkobal)) - Add memory size adjustments for eBPF hash tables. ([#10962](https://github.com/netdata/netdata/pull/10962), [@thiagoftsm](https://github.com/thiagoftsm)) - Add improvements to anomalies collector. 
([#11003](https://github.com/netdata/netdata/pull/11003), [@andrewm4894](https://github.com/andrewm4894)) - Add support for loading of `kprobe` names in the eBPF plugin. ([#11034](https://github.com/netdata/netdata/pull/11034), [@thiagoftsm](https://github.com/thiagoftsm)) - Don't repeat the `cgroup` discovery cleanup info message. ([#11101](https://github.com/netdata/netdata/pull/11101), [@vlvkobal](https://github.com/vlvkobal)) - Change ACLK statistics charts units from kB/s to KiB/s. ([#11103](https://github.com/netdata/netdata/pull/11103), [@ilyam8](https://github.com/ilyam8)) ### Bug fixes - Use `size_t` instead of int for `vfs_bufspace_count` in FreeBSD plugin. ([#11142](https://github.com/netdata/netdata/pull/11142), [@diizzyy](https://github.com/diizzyy)) - Fix the detection of cgroups v2 by checking the version of the default cgroup mountpoint. ([#11102](https://github.com/netdata/netdata/pull/11102), [@vlvkobal](https://github.com/vlvkobal)) - Fix eBPF cachestat chart type. ([#11074](https://github.com/netdata/netdata/pull/11074), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix gaps in eBPF cachestat charts. ([#10972](https://github.com/netdata/netdata/pull/10972), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix detection of `opensipsctl` executable. ([#10978](https://github.com/netdata/netdata/pull/10978), [@ilyam8](https://github.com/ilyam8)) - Fix network interfaces detection when using `virsh`. ([#11096](https://github.com/netdata/netdata/pull/11096), [@ilyam8](https://github.com/ilyam8)) - Fix eBPF plugin crash during shutdown. ([#10957](https://github.com/netdata/netdata/pull/10957), [@thiagoftsm](https://github.com/thiagoftsm)) ## Exporting ### Improvements - Allow the remote write configuration to have multiple destinations ([#11005](https://github.com/netdata/netdata/pull/11005), [@vlvkobal](https://github.com/vlvkobal)) ### Bug fixes - Fix backend chart filtering backward compatibility. 
([#11002](https://github.com/netdata/netdata/pull/11002), [@vlvkobal](https://github.com/vlvkobal)) ## Packaging and installation - Fix `.deb` and `.rpm` packaging of the eBPF plugin. ([#11031](https://github.com/netdata/netdata/pull/11031), [@wangpei-nice](https://github.com/wangpei-nice)) - Upgrade OKay repository for RHEL8. ([#10973](https://github.com/netdata/netdata/pull/10973), [@BastienBalaud](https://github.com/BastienBalaud)) - Support multiple jobs in `make(1)` when building LWS. ([#10799](https://github.com/netdata/netdata/pull/10799), [@vkalintiris](https://github.com/vkalintiris)) - Build `mqtt_websockets` with Netdata autotools. ([#11083](https://github.com/netdata/netdata/pull/11083), [@underhood](https://github.com/underhood)) ## Documentation - Improve dashboard documentation (part 3). ([#11099](https://github.com/netdata/netdata/pull/11099), [@joelhans](https://github.com/joelhans)) - Fix broken link in dimensions/contexts/families doc. ([#11148](https://github.com/netdata/netdata/pull/11148), [@joelhans](https://github.com/joelhans)) - Add info on other memory modes to `performance.md`. ([#11144](https://github.com/netdata/netdata/pull/11144), [@cakrit](https://github.com/cakrit)) - Remove dash-example, place in community repo. ([#11077](https://github.com/netdata/netdata/pull/11077), [@tnyeanderson](https://github.com/tnyeanderson)) - Update `k6.md`. ([#11127](https://github.com/netdata/netdata/pull/11127), [@OdysLam](https://github.com/OdysLam)) - Fix broken links in various docs. ([#11109](https://github.com/netdata/netdata/pull/11109), [@joelhans](https://github.com/joelhans)) - Fix broken link in doc. ([#11122](https://github.com/netdata/netdata/pull/11122), [@forest0](https://github.com/forest0)) - Add third-party collector: Nextcloud plugin. ([#11032](https://github.com/netdata/netdata/pull/11032), [@tknobi](https://github.com/tknobi)) - Add IPv6 listen address example to the Nginx proxy doc. 
([#10473](https://github.com/netdata/netdata/pull/10473), [@jilleJr](https://github.com/jilleJr)) - Add documentation for claiming during kickstart installation. ([#11052](https://github.com/netdata/netdata/pull/11052), [@joelhans](https://github.com/joelhans)) - Add lists of monitored metrics to the `cgroups` plugin documentation. ([#10924](https://github.com/netdata/netdata/pull/10924), [@vlvkobal](https://github.com/vlvkobal)) - Adds --recursive to Git clones. ([#10932](https://github.com/netdata/netdata/pull/10932), [@underhood](https://github.com/underhood)) - Fix formatting and wording in the Apache proxy doc. ([#8706](https://github.com/netdata/netdata/pull/8706), [@cherouvim](https://github.com/cherouvim)) - Fix spelling in the infrastructure monitoring doc. ([#11082](https://github.com/netdata/netdata/pull/11082), [@yavin87](https://github.com/yavin87)) - Improve dashboard documentation (part 1). ([#11015](https://github.com/netdata/netdata/pull/11015), [@joelhans](https://github.com/joelhans)) - Improve dashboard documentation (part 2). ([#11065](https://github.com/netdata/netdata/pull/11065), [@joelhans](https://github.com/joelhans)) - Improve get started/installation docs. ([#10995](https://github.com/netdata/netdata/pull/10995), [@joelhans](https://github.com/joelhans)) - Improve installation method for Alpine. ([#11035](https://github.com/netdata/netdata/pull/11035), [@tiramiseb](https://github.com/tiramiseb)) - Improve StatsD+K6 documentation. ([#10985](https://github.com/netdata/netdata/pull/10985), [@OdysLam](https://github.com/OdysLam)) - Overhaul streaming documentation. ([#10709](https://github.com/netdata/netdata/pull/10709), [@joelhans](https://github.com/joelhans)) - Remove RewriteEngine for dedicated vHost. ([#10873](https://github.com/netdata/netdata/pull/10873), [@Steve8291](https://github.com/Steve8291)) - Remove links to old install doc. 
([#11014](https://github.com/netdata/netdata/pull/11014), [@joelhans](https://github.com/joelhans)) - Remove outdated privacy policy and terms of use. ([#10979](https://github.com/netdata/netdata/pull/10979), [@joelhans](https://github.com/joelhans)) - Replace references to Google Analytics with PostHog where relevant. ([#10868](https://github.com/netdata/netdata/pull/10868), [@andrewm4894](https://github.com/andrewm4894)) - Fix a typo in the cloud providers installation doc. ([#10942](https://github.com/netdata/netdata/pull/10942), [@Avre](https://github.com/Avre)) - Update eBPF documentation with new eBPF configuration filename and directory. ([#10982](https://github.com/netdata/netdata/pull/10982), [@thiagoftsm](https://github.com/thiagoftsm)) - Update web server options for respecting browser DNT. ([#10157](https://github.com/netdata/netdata/pull/10157), [@joelhans](https://github.com/joelhans)) - Fix spelling in StatsD guide. ([#10975](https://github.com/netdata/netdata/pull/10975), [@OdysLam](https://github.com/OdysLam)) - Clarify which health configuration entities are required. ([#11086](https://github.com/netdata/netdata/pull/11086), [@ilyam8](https://github.com/ilyam8)) - Fix alarm line `options` syntax in the docs. ([#10974](https://github.com/netdata/netdata/pull/10974), [@ilyam8](https://github.com/ilyam8)) - Update the `nvidia-smi` collector documentation. ([#10214](https://github.com/netdata/netdata/pull/10214), [@vincentkersten](https://github.com/vincentkersten)) ## Bug fixes - Reduce the number of ACLK chart updates during chart obsoletion. ([#11133](https://github.com/netdata/netdata/pull/11133), [@stelfrag](https://github.com/stelfrag)) - Fix SSL random failures when using multithreaded web server with OpenSSL < `1.1.0`. ([#11089](https://github.com/netdata/netdata/pull/11089), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix storing a `NULL` claim ID on a parent node. 
([#11036](https://github.com/netdata/netdata/pull/11036), [@stelfrag](https://github.com/stelfrag)) - Prevent MQTT connection attempt on OTP failure. ([#10839](https://github.com/netdata/netdata/pull/10839), [@underhood](https://github.com/underhood)) - Rename struct fields from class to classification. ([#11019](https://github.com/netdata/netdata/pull/11019), [@vkalintiris](https://github.com/vkalintiris)) - Fix spelling mistakes in various components: - aclk ([#10910](https://github.com/netdata/netdata/pull/10910), [@jsoref](https://github.com/jsoref)) - build ([#10909](https://github.com/netdata/netdata/pull/10909), [@jsoref](https://github.com/jsoref)) - collectors ([#10912](https://github.com/netdata/netdata/pull/10912), [@jsoref](https://github.com/jsoref)) - daemon ([#10913](https://github.com/netdata/netdata/pull/10913), [@jsoref](https://github.com/jsoref)) - database ([#10914](https://github.com/netdata/netdata/pull/10914), [@jsoref](https://github.com/jsoref)) - exporting ([#10915](https://github.com/netdata/netdata/pull/10915), [@jsoref](https://github.com/jsoref)) - libnetdata ([#10917](https://github.com/netdata/netdata/pull/10917), [@jsoref](https://github.com/jsoref)) - health ([#10916](https://github.com/netdata/netdata/pull/10916), [@jsoref](https://github.com/jsoref)) - streaming ([#10919](https://github.com/netdata/netdata/pull/10919), [@jsoref](https://github.com/jsoref)) - tests ([#10920](https://github.com/netdata/netdata/pull/10920), [@jsoref](https://github.com/jsoref)) - backend ([#10911](https://github.com/netdata/netdata/pull/10911), [@jsoref](https://github.com/jsoref)) - bidirectional ([#10918](https://github.com/netdata/netdata/pull/10918), [@jsoref](https://github.com/jsoref)) - HTTP API ([#10921](https://github.com/netdata/netdata/pull/10921), [@jsoref](https://github.com/jsoref)) - web ([#10922](https://github.com/netdata/netdata/pull/10922), [@jsoref](https://github.com/jsoref)) 2021-05-19T12:21:44+00:00 netdata v1.32.0 netdata 
v1.32.0 2021-11-30T19:50:51+00:00 # Release v1.32.0 The newest version of Netdata, v1.32.0, propels us toward the end of the year, and the Netdata community is positioned to grow stronger than ever in 2022. Before we get into specifics of the new release, it's worth reflecting on that growth. ### Netdata open-source Agent growth The open-source Netdata Agent, the best OSS node monitoring and troubleshooting tool ever, currently has: - 1,000,000 unique Netdata nodes live! - 330,000 engineers using the Agent per month! - Our open-source community growing at an amazing rate, with 3,000 new nodes and 8,000 users per day! - 250,000 Docker pulls per day with 360 million total, according to Docker Hub! ### Netdata Cloud growth Netdata Cloud, our infrastructure-level, distributed, real-time monitoring and troubleshooting orchestrator, is showing similar growth, with: - 35,000 live Netdata nodes! - 90,000 engineers signed up with 200 new sign-ups every day! - 180 new spaces created every day! We are not just pleased with this amazing adoption rate, we are inspired by it. It is you, our users, who give us the energy and confidence to move forward into a new era of high-fidelity, real-time monitoring and troubleshooting, made accessible to everyone! Thank you for the inspiration! You rock! ### Community News As many of you know, even though we are not endorsed by CNCF, Netdata is the fourth most starred project in the CNCF landscape. We want to thank you for this expression of your appreciation. If you love Netdata and haven't yet, consider giving us a [GitHub star](https://github.com/netdata). Additionally, we invite you to join us on our new [Discord server](https://discord.gg/mP7VD76Y), not only to continue our growth but also to join in on fun and informative live conversations with our wonderful community.
## v1.32.0 at a glance The following offers a high-level overview of some of the key changes made in this release, with more detailed descriptions available in subsequent sections. **New Cloud backend and Agent communication protocol** This Agent release supports our new Cloud backend. From here, we will be offering much faster and simpler communication, reliable alerts and exchange of metadata, and first-time support for the parent-child relationship of Netdata Agents. This is the first Agent release that allows Netdata Cloud to use the Netdata Agent as a distributed time-series database that supports replication and query routing, for every metric! **eBPF latency monitoring, container monitoring, and more** We use eBPF to monitor all running processes, without the cooperation of the processes and without sniffing data traffic. This new release includes 13 new eBPF monitoring features, including I/O latency, BTRFS, EXT4, NFS, XFS and ZFS latencies, IRQ latencies, extended swap monitoring, and more. **Machine learning (ML) powered anomaly detection** This release links the Netdata Agent with [dlib](https://github.com/davisking/dlib), the popular C++ machine learning algorithms library, which we use to automatically detect anomalies out of the box, at the edge! Once enabled, Netdata trains an ML model for every metric, which is then used to detect outliers in real time. The resulting "anomaly bit" (where 0=normal, 1=anomalous) associated with each database entry is stored alongside the raw metric value with zero additional storage overhead! This feature is still in development, so it is disabled by default. If you would like to test it and provide feedback, you can enable the feature using the instructions provided in the **Detailed release highlights** section. **New timezone selector and time controls in the user interface** We implemented a new timezone picker and time controls to enhance administrative abilities in the dashboard.
**Docker image POWER8+ support** Netdata Docker images now support recent IBM Power Systems, Raptor Talos II, and more. **And more...** Four new collectors, 112 total improvements, 95 bug fixes, 49 documentation updates, and 57 packaging and installation changes! ## Detailed release highlights ### New Cloud backend and Agent communication protocol It's no secret that the best of Netdata Cloud is yet to come. After several months of developing, testing, and benchmarking a new architectural system, we have steadied ourselves for that growth. These changes should offer notable and immediate improvements in reliability and stability, but more importantly, they allow us to quickly and efficiently develop new features and enhanced functionality. Here's what you can look for on the short-term horizon, thanks to our new architecture: - Greater capacity: The new architecture will change the communication protocol between the Agent and the Cloud to be incremental, improving our agent-handling capacity by ensuring that the Cloud uses measurably less bandwidth. - Parent/child relationships: The new architecture will allow, for the first time, the recognition of parent-child relationships in the Cloud. These changes will enable you to change storage configuration on parents, limit sent metrics, and reduce data frequency to achieve longer data retention for your nodes. On top of this, we will continue to develop the ability for you to have complex setups to scale your monitoring with parents as proxies. Ultimately, this will enable Netdata to operate as a headless connector with the lowest footprint possible on your production nodes. - Alerts: The new architecture will bring a multitude of improvements to our alert presentation over the coming months, allowing for enhanced reliability, alert management, alert logs to be collected in the Cloud, and more.
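To make the idea of an incremental protocol concrete, here is a minimal sketch in Python. It is not Netdata's actual Agent-Cloud protocol; the function name `incremental_update` and the chart names are hypothetical and chosen only for illustration. It assumes the sender remembers the last snapshot it transmitted and sends only what changed since then.

```python
from typing import Dict, List, Tuple


def incremental_update(
    previous: Dict[str, str], current: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Compare two metadata snapshots and return only the differences.

    Hypothetical helper, not part of Netdata: the first element holds
    entries that are new or whose value changed, the second element
    lists keys that disappeared since the previous snapshot.
    """
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = sorted(k for k in previous if k not in current)
    return changed, removed


# Illustrative snapshots of chart metadata (names are made up).
previous = {"system.cpu": "v1", "system.ram": "v1", "disk.sda": "v1"}
current = {"system.cpu": "v1", "system.ram": "v2", "net.eth0": "v1"}

changed, removed = incremental_update(previous, current)
print(changed)  # only the modified and the new entry
print(removed)  # only the entry that went away
```

In this sketch the unchanged `system.cpu` entry is never retransmitted, which is the general reason an incremental scheme tends to use less bandwidth than resending a full snapshot.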
If you would like to be among the first to test this new architecture and provide feedback, first make sure that you have installed the latest Netdata version following [our guide](https://learn.netdata.cloud/docs/get-started/). Then, follow our instructions for [enabling the new architecture](https://learn.netdata.cloud/docs/cloud/beta-architecture/new-architecture#enabling-the-new-architecture). ### eBPF container monitoring We did a lot of work to enhance our eBPF container monitoring in this release. We started with the development of full eBPF support for cgroups. As a refresher on just how important this update is: cgroups, together with namespaces, are the building blocks of containers, which are the dominant way of distributing monitoring applications. We use cgroups to control how much of a given key resource (CPU, memory, network, and disk I/O) can be accessed or used by a process or set of processes. Our eBPF collector now creates charts for each cgroup, which enables us to understand how a specific cgroup interacts with the Linux kernel! 🤓 This enhances our already extensive monitoring by including cgroups for memory, process, network, file access, and [more](https://learn.netdata.cloud/docs/agent/collectors/ebpf.plugin#integration-dashboard-elements). ### eBPF latency monitoring By enabling eBPF monitoring on all systems that support it, Netdata has already been established as a world-leading distributor of eBPF! We use eBPF to monitor all running processes, without the cooperation of the processes, by tracking every way an application interfaces with the system. And in this release, we continue our commitment to further improve eBPF by tracking latencies for disks, IRQs, and more. Our new eBPF latency features include: - A new set of Disk I/O latency charts, which monitor the time that it takes for an I/O request to complete. As many of you may know, this is the most important metric for storage performance!
- IRQ latency monitoring, which shows the time spent servicing interrupts (hard or soft). - A new **Filesystem** submenu that adds latency monitoring for different filesystems: BTRFS, Ext4, NFS, XFS and ZFS. Latency monitoring covers the most common functions, such as the latency of each open request and each sync request. eBPF is a very strong addition to our monitoring tools, and we are committed to providing the best eBPF monitoring experience, observing from a distance without disrupting the data flow! ### Other eBPF enhancements But we didn't stop there with eBPF in v1.32.0. We also provided the following updates: - We moved VFS to a **Filesystem** menu to simplify the visualization of events performed by filesystems. This allows you to monitor actions of filesystems and their latency. - Until now, Netdata had metrics that showed the amount of swap usage. ebpf.plugin now extends swap monitoring to show how a specific application group or cgroup is performing actions on swap. - We have improved process management monitoring by adding monitoring of shared memory and using tracepoints to monitor process creation and exit with more accuracy. - Netdata also brings monitoring of OOM kill events for each [apps group defined on the host](https://learn.netdata.cloud/docs/agent/collectors/apps.plugin#configuration). If you share our interest in eBPF monitoring, or have questions or requests, feel free to drop by our [Community forum](https://community.netdata.cloud) to start a discussion with us. ### Machine learning (ML) powered anomaly detection Machine learning (ML) is undeniably a wave of the future in monitoring and troubleshooting. The Netdata community is riding that wave forward together, ahead of everyone else. Netdata v1.32.0 introduces some foundational capabilities for ML-driven anomaly detection in the Agent.
We have integrated the popular [dlib](https://github.com/davisking/dlib) C++ ML library to power unsupervised anomaly detection out of the box. While this functionality is still under development and subject to change, we want to develop this with you, as a team. The functionality is disabled by default while we dogfood the feature internally and build additional ML-leveraging features into Netdata Cloud. But you can go to the new `[ml]` section in `netdata.conf` and set `enabled=yes` to turn on anomaly detection. After restarting Netdata, you should see the **Anomaly Detection** menu with charts highlighting the overall number and percent of anomalous metrics on your node. This can be a very useful single-number summary of the state of your node. Share your feedback by emailing us at analytics-ml-team@netdata.cloud or just come hang out in the [🤖-ml-powered-monitoring channel](https://discord.gg/4eRSEUpJnc) of our Discord, where we discuss all things ML and more! And then, be on the lookout for some bigger announcements and launches relating to ML over the next couple of months. ### New timezone selector and time controls in the user interface Collaborating in a remote world across regions can be difficult, so we wanted to make it easier for you to sync with your administrative teams and your system information. Our new [timezone selector](https://learn.netdata.cloud/docs/dashboard/visualization-date-and-time-controls#timezone-selector) allows you to select a timezone to accommodate collaboration needs within your teams and infrastructure.
Additionally, we have added the following [time controls](https://learn.netdata.cloud/docs/dashboard/visualization-date-and-time-controls#time-controlsto) to allow you to distinguish whether the content you are looking at is live or historical and to refresh the content of the page when the tabs are in the background: - **Play**: When this option is selected, the content of the page will be automatically refreshed while the page is in the foreground. - **Pause**: When this option is selected, the content of the page will not refresh, either because of a manual request to pause it or, for example, because you are investigating data on a chart (cursor is on top of a chart). - **Force Play**: When this option is selected, the content of the page will be automatically refreshed even if the page is in the background. ### Docker image POWER8+ support And on top of all of that, we have added 64-bit little-endian POWER8+ support to our official Docker images, allowing the use of Netdata Docker images on recent IBM Power Systems, Raptor Talos II, and similar POWER-based hardware. This extends the list of platforms already supported by our Docker images, which includes: - 32- and 64-bit x86 - ARMv7 - AArch64 ## Acknowledgments - [@nabijaczleweli](https://github.com/nabijaczleweli) for fixing writing updater log under root. - [@MikaelUrankar](https://github.com/MikaelUrankar) for fixing calculation of sysctl mib size in freebsd plugin. - [@filip-plata](https://github.com/filip-plata) for adding additional metrics to python.d/postgres collector. - [@eltociear](https://github.com/eltociear) for fixing typos. - [@gotjoshua](https://github.com/gotjoshua) for adding a link to python.d/httpcheck.conf. - [@wangpei-nice](https://github.com/wangpei-nice) for fixing ebpf.plugin segfault when ebpf_load_program returns null pointer. - [@zanechua](https://github.com/zanechua) for adding Microsoft Teams to supported notification endpoints.
- [@diizzyy](https://github.com/diizzyy) for adding support for Intel 2.5G and Synopsys DesignWare nic driver in freebsd plugin. - [@Saruspete](https://github.com/Saruspete) for fixing handling of adding slabs after discovery in slabinfo plugin. - [@mjtice](https://github.com/mjtice) for adding autovacuum and tx wraparound charts to python.d/postgres. - [@charoleizer](https://github.com/charoleizer) for adding PostgreSQL version to requirements section. - [@danmichaelo](https://github.com/danmichaelo) for fixing a typo in exporting docs. - [@oldgiova](https://github.com/oldgiova) for adding capsh check before issuing setcap cap_perfmon. - [@oldgiova](https://github.com/oldgiova) for adding Travis ctrl file for checking if changes happened. - [@0x3333](https://github.com/0x3333) for fixing an inconsistent status check in charts.d/apcupsd. - [@etienne-napoleone](https://github.com/etienne-napoleone) for adding terra related binaries to blockchains apps plugin group. - [@anayrat](https://github.com/anayrat) for fixing postgres replication_slot chart on standby. - [@vpiserchia](https://github.com/vpiserchia) for fixing handling of null values returned by _cat/indices API in python.d/elasticsearch. - [@elelayan](https://github.com/elelayan) for fixing zpool state parsing in proc/zfs. - [@steffenweber](https://github.com/steffenweber) for adding missing privilege to fix MySQL slave reporting. - [@unhandled-exception](https://github.com/unhandled-exception) for adding sorting of the list of databases in alphabetical order in python.d/postgres. - [@78Star](https://github.com/78Star) for updating Netdata and its dependencies versions for pfSense. - [@unhandled-exception](https://github.com/unhandled-exception) for fixing crashing of the wal query if wal-file was removed concurrently in python.d/postgres. - [@rupokify](https://github.com/rupokify) for updating jQuery dependency. - [@caleno](https://github.com/caleno) for fixing a typo in streaming docs. 
- [@rex4539](https://github.com/rex4539) for fixing typos. ## Dashboard - Add various updates to dashboard info ([#11639](https://github.com/netdata/netdata/pull/11639), [@ilyam8](https://github.com/ilyam8)) - Add timex plugin chart descriptions ([#11635](https://github.com/netdata/netdata/pull/11635), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin zfs chart descriptions ([#11630](https://github.com/netdata/netdata/pull/11630), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin infiniband chart descriptions ([#11628](https://github.com/netdata/netdata/pull/11628), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin pagetypeinfo chart descriptions ([#11627](https://github.com/netdata/netdata/pull/11627), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_wireless chart descriptions ([#11626](https://github.com/netdata/netdata/pull/11626), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_rpc_nfs and net_rpc_nfsd chart descriptions ([#11625](https://github.com/netdata/netdata/pull/11625), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin power_supply chart descriptions ([#11619](https://github.com/netdata/netdata/pull/11619), [@ilyam8](https://github.com/ilyam8)) - Add cgroups plugin systemd services chart descriptions ([#11618](https://github.com/netdata/netdata/pull/11618), [@ilyam8](https://github.com/ilyam8)) - Add cgroups plugin chart descriptions ([#11607](https://github.com/netdata/netdata/pull/11607), [@ilyam8](https://github.com/ilyam8)) - Add apps plugin chart descriptions ([#11601](https://github.com/netdata/netdata/pull/11601), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin vmstat chart descriptions ([#11597](https://github.com/netdata/netdata/pull/11597), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin ksm chart descriptions ([#11595](https://github.com/netdata/netdata/pull/11595), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin edac chart descriptions 
([#11589](https://github.com/netdata/netdata/pull/11589), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin stat chart descriptions ([#11586](https://github.com/netdata/netdata/pull/11586), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_stat_synproxy chart descriptions ([#11581](https://github.com/netdata/netdata/pull/11581), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin softirqs chart descriptions ([#11577](https://github.com/netdata/netdata/pull/11577), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_stat_conntrack chart descriptions ([#11576](https://github.com/netdata/netdata/pull/11576), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin uptime chart descriptions ([#11569](https://github.com/netdata/netdata/pull/11569), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_sockstat and net_sockstat6 chart descriptions ([#11567](https://github.com/netdata/netdata/pull/11567), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_snmp6 chart descriptions ([#11565](https://github.com/netdata/netdata/pull/11565), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_sctp_snmp chart descriptions ([#11564](https://github.com/netdata/netdata/pull/11564), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_snmp chart descriptions ([#11557](https://github.com/netdata/netdata/pull/11557), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_netstat chart descriptions ([#11554](https://github.com/netdata/netdata/pull/11554), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_ip_vs_stats chart descriptions ([#11546](https://github.com/netdata/netdata/pull/11546), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin net_dev chart descriptions ([#11543](https://github.com/netdata/netdata/pull/11543), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin meminfo chart descriptions ([#11541](https://github.com/netdata/netdata/pull/11541), [@ilyam8](https://github.com/ilyam8)) - Add 
proc plugin mdstat chart descriptions ([#11537](https://github.com/netdata/netdata/pull/11537), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin interrupts chart descriptions ([#11532](https://github.com/netdata/netdata/pull/11532), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin diskstats chart descriptions ([#11528](https://github.com/netdata/netdata/pull/11528), [@ilyam8](https://github.com/ilyam8)) - Add proc plugin ipc semaphores chart descriptions ([#11523](https://github.com/netdata/netdata/pull/11523), [@ilyam8](https://github.com/ilyam8)) - Remove 'vernemq.queue_messages_in_queues' from dashboard info ([#11403](https://github.com/netdata/netdata/pull/11403), [@ilyam8](https://github.com/ilyam8)) - Move MD arrays charts under Disks ([#11119](https://github.com/netdata/netdata/pull/11119), [@thiagoftsm](https://github.com/thiagoftsm)) --- ## Collectors ### New - Add Traefik collector (go.d/traefik) ([#605](https://github.com/netdata/go.d.plugin/pull/605), [@ilyam8](https://github.com/ilyam8)) - Add HAProxy collector (go.d/haproxy) ([#599](https://github.com/netdata/go.d.plugin/pull/599), [@ilyam8](https://github.com/ilyam8)) - Add Mongodb collector (go.d/mongodb) ([#598](https://github.com/netdata/go.d.plugin/pull/598), [@georgeok](https://github.com/georgeok)) - Add Ethereum Node collector (go.d/geth) ([#585](https://github.com/netdata/go.d.plugin/pull/585), [@odyslam](https://github.com/odyslam)) ### Improvements - Add AWS to apps_groups.conf ([#11826](https://github.com/netdata/netdata/pull/11826), [@ilyam8](https://github.com/ilyam8)) - Show stats for systemd protected mount points (diskspace plugin) ([#11767](https://github.com/netdata/netdata/pull/11767), [@vlvkobal](https://github.com/vlvkobal)) - Add support for v1.7.0+ (go.d/coredns) ([#619](https://github.com/netdata/go.d.plugin/pull/619), [@georgeok](https://github.com/georgeok)) - Add "/basic_status" job nginx.conf (go.d/nginx) 
([#612](https://github.com/netdata/go.d.plugin/pull/612), [@ilyam8](https://github.com/ilyam8)) - Add sharding metrics (go.d/mongodb) ([#609](https://github.com/netdata/go.d.plugin/pull/609), [@georgeok](https://github.com/georgeok)) - Add thread operations metrics (go.d/mysql) ([#607](https://github.com/netdata/go.d.plugin/pull/607), [@ilyam8](https://github.com/ilyam8)) - Add replica sets metrics (go.d/mongodb) ([#604](https://github.com/netdata/go.d.plugin/pull/604), [@georgeok](https://github.com/georgeok)) - Add databases metrics (go.d/mongodb) ([#602](https://github.com/netdata/go.d.plugin/pull/602), [@georgeok](https://github.com/georgeok)) - Add more OS (operating system) charts (go.d/wmi) ([#593](https://github.com/netdata/go.d.plugin/pull/593), [@ilyam8](https://github.com/ilyam8)) - Add caddy job to prometheus.conf (go.d/prometheus) ([#581](https://github.com/netdata/go.d.plugin/pull/578), [@odyslam](https://github.com/odyslam)) - Add AOF file size metrics (go.d/redis) ([#578](https://github.com/netdata/go.d.plugin/pull/578), [@ilyam8](https://github.com/ilyam8)) - Add openethereum/geth jobs to prometheus.conf (go.d/prometheus) ([#578](https://github.com/netdata/go.d.plugin/pull/578), [@odyslam](https://github.com/odyslam)) - Update whois/whois-parser packages and add timeout configuration option (go.d/whoisquery) ([#576](https://github.com/netdata/go.d.plugin/pull/576), [@ilyam8](https://github.com/ilyam8)) - Disable reporting min/avg/max group uptime by default (apps plugin) ([#11609](https://github.com/netdata/netdata/pull/11609), [@ilyam8](https://github.com/ilyam8)) - Add sorting of the list of databases in alphabetical order (python.d/postgres) ([#11580](https://github.com/netdata/netdata/pull/11580), [@unhandled-exception](https://github.com/unhandled-exception)) - Add terra related binaries to blockchains group (apps plugin) ([#11437](https://github.com/netdata/netdata/pull/11437), [@etienne-napoleone](https://github.com/etienne-napoleone)) - Add
instruction per cycle charts (perf plugin) ([#11392](https://github.com/netdata/netdata/pull/11392), [@thiagoftsm](https://github.com/thiagoftsm)) - Add autovacuum and tx wraparound charts (python.d/postgres) ([#11267](https://github.com/netdata/netdata/pull/11267), [@mjtice](https://github.com/mjtice)) - Add support for Intel 2.5G and Synopsys DesignWare nic driver (freebsd plugin) ([#11251](https://github.com/netdata/netdata/pull/11251), [@diizzyy](https://github.com/diizzyy)) - Add web3 and blockchains groups (apps plugin) ([#11220](https://github.com/netdata/netdata/pull/11220), [@odyslam](https://github.com/odyslam)) - Implement merging user/stock configuration files (python.d plugin) ([#11217](https://github.com/netdata/netdata/pull/11217), [@ilyam8](https://github.com/ilyam8)) - Rename default job from 'local' to 'anomalies' (python.d/anomalies) ([#11178](https://github.com/netdata/netdata/pull/11178), [@andrewm4894](https://github.com/andrewm4894)) - Add standby lag and blocking transactions charts (python.d/postgres) ([#11169](https://github.com/netdata/netdata/pull/11169), [@filip-plata](https://github.com/filip-plata)) ### Bug fixes - Fix renaming for cgroups with dots in the path (cgroups plugin) ([#11775](https://github.com/netdata/netdata/pull/11775), [@vlvkobal](https://github.com/vlvkobal)) - Fix exiting on SIGPIPE (go.d plugin) ([#630](https://github.com/netdata/go.d.plugin/pull/630), [@ilyam8](https://github.com/ilyam8)) - Fix domain syntax validation (go.d/whoisquery) ([#629](https://github.com/netdata/go.d.plugin/pull/629), [@ilyam8](https://github.com/ilyam8)) - Fix missing NONE in valid request methods (go.d/squidlog) ([#621](https://github.com/netdata/go.d.plugin/pull/621), [@ilyam8](https://github.com/ilyam8)) - Remove wrong "queue_messages_in_queues" chart (go.d/vernemq) ([#601](https://github.com/netdata/go.d.plugin/pull/601), [@ilyam8](https://github.com/ilyam8)) - Fix HTTP/socket client initialization order (go.d/phpfpm) 
([#591](https://github.com/netdata/go.d.plugin/pull/591), [@ilyam8](https://github.com/ilyam8)) - Fix scraping metrics when resources are not discovered (go.d/vsphere) ([#589](https://github.com/netdata/go.d.plugin/pull/589), [@ilyam8](https://github.com/ilyam8)) - Fix LTSV log format parsing (go.d/weblog) ([#584](https://github.com/netdata/go.d.plugin/pull/584), [@ilyam8](https://github.com/ilyam8)) - Fix expiration date parsing (go.d/whoisquery) ([#575](https://github.com/netdata/go.d.plugin/pull/575), [@ilyam8](https://github.com/ilyam8)) - Fix containers name resolution for crio/containerd runtime (cgroups plugin) ([#11756](https://github.com/netdata/netdata/pull/11756), [@ilyam8](https://github.com/ilyam8)) - Add sensors to charts.d.conf and add a note on how to enable it (charts.d plugin) ([#11715](https://github.com/netdata/netdata/pull/11715), [@ilyam8](https://github.com/ilyam8)) - Fix crashing of the wal query if wal-file was removed concurrently (python.d/postgres) ([#11697](https://github.com/netdata/netdata/pull/11697), [@unhandled-exception](https://github.com/unhandled-exception)) - Fix "lsns: unknown column" logging (cgroups plugin) ([#11687](https://github.com/netdata/netdata/pull/11687), [@ilyam8](https://github.com/ilyam8)) - Fix nfsd RPC metrics and remove unused nfsd charts and metrics (proc/nfsd) ([#11632](https://github.com/netdata/netdata/pull/11632), [@vlvkobal](https://github.com/vlvkobal)) - Fix "proc4ops" chart family (proc/nfsd) ([#11623](https://github.com/netdata/netdata/pull/11623), [@ilyam8](https://github.com/ilyam8)) - Fix swap size calculation (cgroups plugin) ([#11617](https://github.com/netdata/netdata/pull/11617), [@vlvkobal](https://github.com/vlvkobal)) - Fix RSS memory counter for systemd services (cgroups plugin) ([#11616](https://github.com/netdata/netdata/pull/11616), [@vlvkobal](https://github.com/vlvkobal)) - Fix VBE parsing (python.d/varnish) ([#11596](https://github.com/netdata/netdata/pull/11596), 
[@ilyam8](https://github.com/ilyam8)) - Remove unused synproxy chart (proc/synproxy) ([#11582](https://github.com/netdata/netdata/pull/11582), [@vlvkobal](https://github.com/vlvkobal)) - Fix zpool state parsing (proc/zfs) ([#11545](https://github.com/netdata/netdata/pull/11545), [@elelayan](https://github.com/elelayan)) - Fix null values returned by '_cat/indices' API (python.d/elasticsearch) ([#11501](https://github.com/netdata/netdata/pull/11501), [@vpiserchia](https://github.com/vpiserchia)) - Fix replication_slot chart on standby (python.d/postgres) ([#11455](https://github.com/netdata/netdata/pull/11455), [@anayrat](https://github.com/anayrat)) - Fix an inconsistent status check (charts.d/apcupsd) ([#11435](https://github.com/netdata/netdata/pull/11435), [@0x3333](https://github.com/0x3333)) - Fix plugin name (stats.d plugin) ([#11400](https://github.com/netdata/netdata/pull/11400), [@vlvkobal](https://github.com/vlvkobal)) - Fix plugin names (freebsd and macos plugins) ([#11398](https://github.com/netdata/netdata/pull/11398), [@vlvkobal](https://github.com/vlvkobal)) - Fix lack of "module" in chart definition (all chart.d modules) ([#11390](https://github.com/netdata/netdata/pull/11390), [@ilyam8](https://github.com/ilyam8)) - Fix various python modules charts contexts (python.d/smartd_log, mysql, zscores) ([#11310](https://github.com/netdata/netdata/pull/11310), [@ilyam8](https://github.com/ilyam8)) - Fix current operation charts title and context (proc/mdstat) ([#11289](https://github.com/netdata/netdata/pull/11289), [@ilyam8](https://github.com/ilyam8)) - Fix handling of adding slabs after discovery (slabinfo plugin) ([#11257](https://github.com/netdata/netdata/pull/11257), [@Saruspete](https://github.com/Saruspete)) - Fix calculation of sysctl mib size (freebsd plugin) ([#11159](https://github.com/netdata/netdata/pull/11159), [@MikaelUrankar](https://github.com/MikaelUrankar)) ## eBPF ### New - Add MD flush calls tracking 
([#11681](https://github.com/netdata/netdata/pull/11681), [@UmanShahzad](https://github.com/UmanShahzad)) - Add shared memory system calls tracking ([#11560](https://github.com/netdata/netdata/pull/11560), [@UmanShahzad](https://github.com/UmanShahzad)) - Add OOM kills tracking ([#11470](https://github.com/netdata/netdata/pull/11470), [@UmanShahzad](https://github.com/UmanShahzad)) - Add soft IRQ latency tracking ([#11445](https://github.com/netdata/netdata/pull/11445), [@UmanShahzad](https://github.com/UmanShahzad)) - Add hard IRQ latency tracking ([#11410](https://github.com/netdata/netdata/pull/11410), [@UmanShahzad](https://github.com/UmanShahzad)) - Add mount/umount calls tracking ([#11358](https://github.com/netdata/netdata/pull/11358), [@thiagoftsm](https://github.com/thiagoftsm)) - Add btrfs latency monitoring ([#11348](https://github.com/netdata/netdata/pull/11348), [@thiagoftsm](https://github.com/thiagoftsm)) - Add ZFS latency monitoring ([#11330](https://github.com/netdata/netdata/pull/11330), [@thiagoftsm](https://github.com/thiagoftsm)) - Add NFS latency monitoring ([#11313](https://github.com/netdata/netdata/pull/11313), [@thiagoftsm](https://github.com/thiagoftsm)) - Add disk latency monitoring ([#11276](https://github.com/netdata/netdata/pull/11276), [@thiagoftsm](https://github.com/thiagoftsm)) - Add XFS latency monitoring ([#11238](https://github.com/netdata/netdata/pull/11238), [@thiagoftsm](https://github.com/thiagoftsm)) - Add ext4 latency monitoring ([#11224](https://github.com/netdata/netdata/pull/11224), [@thiagoftsm](https://github.com/thiagoftsm)) - Add extended swap monitoring ([#11090](https://github.com/netdata/netdata/pull/11090), [@thiagoftsm](https://github.com/thiagoftsm)) ### Improvements - Add (eBPF) to submenu ([#11721](https://github.com/netdata/netdata/pull/11721), [@thiagoftsm](https://github.com/thiagoftsm)) - Process monitoring cleanup and improvements ([#11643](https://github.com/netdata/netdata/pull/11643), 
[@thiagoftsm](https://github.com/thiagoftsm)) - Add integration with cgroups plugin (socket, shared memory, cachestat) ([#11642](https://github.com/netdata/netdata/pull/11642), [@thiagoftsm](https://github.com/thiagoftsm)) - Add integration with cgroups plugin (process, file descriptor, VFS, directory cache and OOMkill) ([#11611](https://github.com/netdata/netdata/pull/11611), [@thiagoftsm](https://github.com/thiagoftsm)) - Add initial integration with cgroups plugin (swap) ([#11573](https://github.com/netdata/netdata/pull/11573), [@thiagoftsm](https://github.com/thiagoftsm)) - Add integration with cgroups plugin (create shared memory with cgroups) ([#11559](https://github.com/netdata/netdata/pull/11559), [@thiagoftsm](https://github.com/thiagoftsm)) - Update charts descriptions ([#11547](https://github.com/netdata/netdata/pull/11547), [@thiagoftsm](https://github.com/thiagoftsm)) - Convert eBPF submenus to lowercase ([#11511](https://github.com/netdata/netdata/pull/11511), [@thiagoftsm](https://github.com/thiagoftsm)) - Socket monitoring code improvements and update charts descriptions ([#11441](https://github.com/netdata/netdata/pull/11441), [@thiagoftsm](https://github.com/thiagoftsm)) - Move file operation monitoring to a separate thread ([#11401](https://github.com/netdata/netdata/pull/11401), [@thiagoftsm](https://github.com/thiagoftsm)) - Add module names for threads ([#11387](https://github.com/netdata/netdata/pull/11387), [@thiagoftsm](https://github.com/thiagoftsm)) - Move repeating part of latency chart descriptions to the family level ([#11363](https://github.com/netdata/netdata/pull/11363), [@thiagoftsm](https://github.com/thiagoftsm)) - Reduce plugin's memory usage ([#11256](https://github.com/netdata/netdata/pull/11256), [@thiagoftsm](https://github.com/thiagoftsm)) - Assorted improvements and fixes ([#11230](https://github.com/netdata/netdata/pull/11230), [@thiagoftsm](https://github.com/thiagoftsm)) - Move VFS monitoring to a separate thread and 
add new charts ([#11187](https://github.com/netdata/netdata/pull/11187), [@thiagoftsm](https://github.com/thiagoftsm)) ### Bug fixes - Fix command line arguments ([#11670](https://github.com/netdata/netdata/pull/11670), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix hardirq/softirq value init logic ([#11471](https://github.com/netdata/netdata/pull/11471), [@UmanShahzad](https://github.com/UmanShahzad)) - Fix VFS index reference ([#11356](https://github.com/netdata/netdata/pull/11356), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix a case when multiple eBPF plugins are running ([#11287](https://github.com/netdata/netdata/pull/11287), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix applying configuration options ([#11253](https://github.com/netdata/netdata/pull/11253), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix a segfault when ebpf_load_program returns null pointer ([#11203](https://github.com/netdata/netdata/pull/11203), [@wangpei-nice](https://github.com/wangpei-nice)) - Fix a wrong pointer to a function and move parser to main thread ([#11152](https://github.com/netdata/netdata/pull/11152), [@thiagoftsm](https://github.com/thiagoftsm)) --- ## Health ### Improvements - Remove pihole_blocked_queries alert ([#11829](https://github.com/netdata/netdata/pull/11829), [@Ancairon](https://github.com/Ancairon)) - Improve check for supported -F parameter in sendmail ([#11506](https://github.com/netdata/netdata/pull/11506), [@MrZammler](https://github.com/MrZammler)) - Add custom e-mail headers ([#11454](https://github.com/netdata/netdata/pull/11454), [@MrZammler](https://github.com/MrZammler)) - Add 'cockroachdb_underreplicated_ranges' alarm ([#11360](https://github.com/netdata/netdata/pull/11360), [@ilyam8](https://github.com/ilyam8)) - Disable 'oom_kill' alarm on k8s nodes ([#11359](https://github.com/netdata/netdata/pull/11359), [@ilyam8](https://github.com/ilyam8)) - Add geth stock alarms ([#11341](https://github.com/netdata/netdata/pull/11341), 
[@odyslam](https://github.com/odyslam)) - Remove python.d module-specific last_collected alarms ([#11307](https://github.com/netdata/netdata/pull/11307), [@ilyam8](https://github.com/ilyam8)) - Remove CockroachDB deprecated alarms ([#11235](https://github.com/netdata/netdata/pull/11235), [@ilyam8](https://github.com/ilyam8)) - Add new email notification template ([#11219](https://github.com/netdata/netdata/pull/11219), [@MrZammler](https://github.com/MrZammler)) - Add system clock synchronization state alarm ([#11177](https://github.com/netdata/netdata/pull/11177), [@ilyam8](https://github.com/ilyam8)) - Add python.d/go.d jobs last_collected_secs alarms ([#11168](https://github.com/netdata/netdata/pull/11168), [@ilyam8](https://github.com/ilyam8)) - Make stock alarms less sensitive ([#11153](https://github.com/netdata/netdata/pull/11153), [@ilyam8](https://github.com/ilyam8)) ### Bug fixes - Fix swap_used alarm calculation ([#11672](https://github.com/netdata/netdata/pull/11672), [@ilyam8](https://github.com/ilyam8)) - Fix ram level alarms ([#11452](https://github.com/netdata/netdata/pull/11452), [@ilyam8](https://github.com/ilyam8)) - Fix 'gearman_workers_queued' alarm ([#11361](https://github.com/netdata/netdata/pull/11361), [@ilyam8](https://github.com/ilyam8)) - Fix sending MS Teams notifications to multiple channels ([#11355](https://github.com/netdata/netdata/pull/11355), [@ilyam8](https://github.com/ilyam8)) - Fix sendmail 'unrecognized option: F' issue ([#11283](https://github.com/netdata/netdata/pull/11283), [@MrZammler](https://github.com/MrZammler)) - Update old logo to new one ([#11263](https://github.com/netdata/netdata/pull/11263), [@odyslam](https://github.com/odyslam)) - Swap class and type attributes in stock alarm configurations ([#11240](https://github.com/netdata/netdata/pull/11240), [@MrZammler](https://github.com/MrZammler)) - Fix alarm line 'charts' matching ([#11204](https://github.com/netdata/netdata/pull/11204), 
[@ilyam8](https://github.com/ilyam8)) --- ## Documentation - Updating ansible steps for clarity ([#11823](https://github.com/netdata/netdata/pull/11823), [@kickoke](https://github.com/kickoke)) - Add a note about pkg-config file location for freeipmi ([#11831](https://github.com/netdata/netdata/pull/11831), [@vlvkobal](https://github.com/vlvkobal)) - Fix broken link in charts.mdx ([#11808](https://github.com/netdata/netdata/pull/11808), [@DShreve2](https://github.com/DShreve2)) - Fix typos ([#11782](https://github.com/netdata/netdata/pull/11782), [@rex4539](https://github.com/rex4539)) - Add nightly release version to readme ([#11780](https://github.com/netdata/netdata/pull/11780), [@andrewm4894](https://github.com/andrewm4894)) - Fix link to new charts ([#11773](https://github.com/netdata/netdata/pull/11773), [@DShreve2](https://github.com/DShreve2)) - Fix typos in netdata-security.md ([#11772](https://github.com/netdata/netdata/pull/11772), [@jlbriston](https://github.com/jlbriston)) - Update eBPF documentation (Filesystem and HardIRQ) ([#11752](https://github.com/netdata/netdata/pull/11752), [@UmanShahzad](https://github.com/UmanShahzad)) - Add command for new health entity file ([#11733](https://github.com/netdata/netdata/pull/11733), [@DShreve2](https://github.com/DShreve2)) - Remove dated contact suggestion ([#11732](https://github.com/netdata/netdata/pull/11732), [@DShreve2](https://github.com/DShreve2)) - Add documentation about Filesystem and HardIRQ ([#11752](https://github.com/netdata/netdata/pull/11752), [@UmanShahzad](https://github.com/UmanShahzad)) - Fix a typo in streaming docs ([#11747](https://github.com/netdata/netdata/pull/11747), [@caleno](https://github.com/caleno)) - Update eBPF documentation ([#11741](https://github.com/netdata/netdata/pull/11741), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix broken link - Charts 2.0 ([#11729](https://github.com/netdata/netdata/pull/11729), [@DShreve2](https://github.com/DShreve2)) - Fix broken link - 
eBPF plugin ([#11728](https://github.com/netdata/netdata/pull/11728), [@DShreve2](https://github.com/DShreve2)) - Add Cloud sign-up link ([#11714](https://github.com/netdata/netdata/pull/11714), [@DShreve2](https://github.com/DShreve2)) - Update claiming instructions for Docker ([#11713](https://github.com/netdata/netdata/pull/11713), [@DShreve2](https://github.com/DShreve2)) - Fix broken links in kickstart.md ([#11708](https://github.com/netdata/netdata/pull/11708), [@DShreve2](https://github.com/DShreve2)) - Add missing collectors to the eBPF plugin readme ([#11703](https://github.com/netdata/netdata/pull/11703), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix broken link - Charts 2.0 ([#11701](https://github.com/netdata/netdata/pull/11701), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update Netdata and dependencies versions for pfSense ([#11674](https://github.com/netdata/netdata/pull/11674), [@78Star](https://github.com/78Star)) - Add a note about new release of charts on the Cloud ([#11637](https://github.com/netdata/netdata/pull/11637), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update optional parameters for upcoming installer ([#11604](https://github.com/netdata/netdata/pull/11604), [@DShreve2](https://github.com/DShreve2)) - Add missing privilege to fix MySQL slave reporting ([#11574](https://github.com/netdata/netdata/pull/11574), [@steffenweber](https://github.com/steffenweber)) - Fix broken links ([#11540](https://github.com/netdata/netdata/pull/11540), [@ilyam8](https://github.com/ilyam8)) - Update london demo to point at london3 ([#11533](https://github.com/netdata/netdata/pull/11533), [@andrewm4894](https://github.com/andrewm4894)) - Add a note about handling backslashes in health configuration files ([#11527](https://github.com/netdata/netdata/pull/11527), [@ilyam8](https://github.com/ilyam8)) - Improve streaming documentation wording ([#11510](https://github.com/netdata/netdata/pull/11510), 
[@siamaktavakoli](https://github.com/siamaktavakoli)) - Fix a typo in claiming docs ([#11492](https://github.com/netdata/netdata/pull/11492), [@car12o](https://github.com/car12o)) - Remove broken link ([#11482](https://github.com/netdata/netdata/pull/11482), [@andrewm4894](https://github.com/andrewm4894)) - Add a note on how to find web files directory for custom dashboards ([#11461](https://github.com/netdata/netdata/pull/11461), [@ilyam8](https://github.com/ilyam8)) - Update "Install Netdata on Synology" guide ([#11449](https://github.com/netdata/netdata/pull/11449), [@ilyam8](https://github.com/ilyam8)) - Update installation documentation ([#11442](https://github.com/netdata/netdata/pull/11442), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update eBPF documentation ([#11440](https://github.com/netdata/netdata/pull/11440), [@thiagoftsm](https://github.com/thiagoftsm)) - Add time controls and timezone selector description ([#11433](https://github.com/netdata/netdata/pull/11433), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix broken links - Custom dashboards ([#11413](https://github.com/netdata/netdata/pull/11413), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix broken links - Custom dashboards ([#11405](https://github.com/netdata/netdata/pull/11405), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Rename claiming action to connect ([#11378](https://github.com/netdata/netdata/pull/11378), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix a typo in exporting docs ([#11376](https://github.com/netdata/netdata/pull/11376), [@danmichaelo](https://github.com/danmichaelo)) - Add PostgreSQL version to requirements section ([#11328](https://github.com/netdata/netdata/pull/11328), [@charoleizer](https://github.com/charoleizer)) - Minor fixes ([#11320](https://github.com/netdata/netdata/pull/11320), [@UmanShahzad](https://github.com/UmanShahzad)) - Fix prometheus node CPU alert rule 
([#11309](https://github.com/netdata/netdata/pull/11309), [@ilyam8](https://github.com/ilyam8)) - Updated get-started.mdx ([#11303](https://github.com/netdata/netdata/pull/11303), [@jlbriston](https://github.com/jlbriston)) - Add Legacy/NG ACLK documentation ([#11243](https://github.com/netdata/netdata/pull/11243), [@underhood](https://github.com/underhood)) - Add links to data privacy page ([#11226](https://github.com/netdata/netdata/pull/11226), [@joelhans](https://github.com/joelhans)) - Add Microsoft Teams to supported notification endpoints ([#11205](https://github.com/netdata/netdata/pull/11205), [@zanechua](https://github.com/zanechua)) - Add a link to python.d/httpcheck.conf ([#11182](https://github.com/netdata/netdata/pull/11182), [@gotjoshua](https://github.com/gotjoshua)) - Fix broken links ([#11175](https://github.com/netdata/netdata/pull/11175), [@joelhans](https://github.com/joelhans)) - Update news about the latest release ([#11165](https://github.com/netdata/netdata/pull/11165), [@joelhans](https://github.com/joelhans)) ## Packaging / Installation - Use pip3 when installing git-semver package ([#11817](https://github.com/netdata/netdata/pull/11817), [@maneamarius](https://github.com/maneamarius)) - Add POWER8+ static builds ([#11802](https://github.com/netdata/netdata/pull/11802), [@Ferroin](https://github.com/Ferroin)) - Update libbpf to v0.5.1 ([#11800](https://github.com/netdata/netdata/pull/11800), [@thiagoftsm](https://github.com/thiagoftsm)) - Verify checksums of makeself deps ([#11791](https://github.com/netdata/netdata/pull/11791), [@vkalintiris](https://github.com/vkalintiris)) - Update go.d.plugin version to v0.31.0 ([#11789](https://github.com/netdata/netdata/pull/11789), [@ilyam8](https://github.com/ilyam8)) - Add Oracle Linux 8 to CI and package builds ([#11776](https://github.com/netdata/netdata/pull/11776), [@Ferroin](https://github.com/Ferroin)) - Fix a typo in installation script 
([#11766](https://github.com/netdata/netdata/pull/11766), [@ShimonOhayon](https://github.com/ShimonOhayon)) - Update dashboard to v2.20.11 ([#11743](https://github.com/netdata/netdata/pull/11743)) - Minor improvement to CPU number function regarding macOS. ([#11746](https://github.com/netdata/netdata/pull/11746), [@iigorkarpov](https://github.com/iigorkarpov)) - Add log grouping in installer and static build code when running under GitHub Actions. ([#11720](https://github.com/netdata/netdata/pull/11720), [@Ferroin](https://github.com/Ferroin)) - Add basic telemetry to the new kickstart script. ([#11718](https://github.com/netdata/netdata/pull/11718), [@Ferroin](https://github.com/Ferroin)) - Add eBPF plugin to static binaries ([#11709](https://github.com/netdata/netdata/pull/11709), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix libbpf handling in RPM package builds. ([#11702](https://github.com/netdata/netdata/pull/11702), [@Ferroin](https://github.com/Ferroin)) - Don't use api.github.com when checking for latest stable version ([#11700](https://github.com/netdata/netdata/pull/11700), [@ilyam8](https://github.com/ilyam8)) - Fix handling of disabling telemetry in static installs. ([#11689](https://github.com/netdata/netdata/pull/11689), [@Ferroin](https://github.com/Ferroin)) - Mark g++ for freebsd as NOTREQUIRED ([#11678](https://github.com/netdata/netdata/pull/11678), [@MrZammler](https://github.com/MrZammler)) - Optimize static build and update various dependencies. ([#11660](https://github.com/netdata/netdata/pull/11660), [@Ferroin](https://github.com/Ferroin)) - Improve installation on systems with limited RAM. ([#11658](https://github.com/netdata/netdata/pull/11658), [@Ferroin](https://github.com/Ferroin)) - Add support for local builds to the new kickstart script. ([#11654](https://github.com/netdata/netdata/pull/11654), [@Ferroin](https://github.com/Ferroin)) - Explicitly opt out of LTO in RPM builds. 
([#11644](https://github.com/netdata/netdata/pull/11644), [@Ferroin](https://github.com/Ferroin)) - Add flag to mark containers as created from official images in analytics. ([#11606](https://github.com/netdata/netdata/pull/11606), [@Ferroin](https://github.com/Ferroin)) - Add POWER8+ support to our official Docker images. ([#11592](https://github.com/netdata/netdata/pull/11592), [@Ferroin](https://github.com/Ferroin)) - Disable eBPF compilation in different platforms ([#11566](https://github.com/netdata/netdata/pull/11566), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix installer flag --use-system-protobuf ([#11539](https://github.com/netdata/netdata/pull/11539), [@underhood](https://github.com/underhood)) - Re-add EPEL on CentOS 7. ([#11525](https://github.com/netdata/netdata/pull/11525), [@Ferroin](https://github.com/Ferroin)) - Use the correct exit status for the updater with static updates. ([#11520](https://github.com/netdata/netdata/pull/11520), [@Ferroin](https://github.com/Ferroin)) - Remove `reset_netdata_trace.sh` from netdata.service ([#11517](https://github.com/netdata/netdata/pull/11517), [@ilyam8](https://github.com/ilyam8)) - Install basic netdata deps by default. ([#11508](https://github.com/netdata/netdata/pull/11508), [@Ferroin](https://github.com/Ferroin)) - Fix handling of claiming in kickstart script when running as non-root. ([#11507](https://github.com/netdata/netdata/pull/11507), [@Ferroin](https://github.com/Ferroin)) - Use system copy of protobuf in Docker images and static builds. ([#11496](https://github.com/netdata/netdata/pull/11496), [@Ferroin](https://github.com/Ferroin)) - Add initial implementation of new kickstart script. 
([#11493](https://github.com/netdata/netdata/pull/11493), [@Ferroin](https://github.com/Ferroin)) - Add static builds for ARMv7l and ARMv8a ([#11490](https://github.com/netdata/netdata/pull/11490), [@Ferroin](https://github.com/Ferroin)) - Add the ability to allow arbitrary options to be passed to make from netdata-installer.sh. ([#11479](https://github.com/netdata/netdata/pull/11479), [@Ferroin](https://github.com/Ferroin)) - Embed build architecture in static build archive names. ([#11463](https://github.com/netdata/netdata/pull/11463), [@Ferroin](https://github.com/Ferroin)) - Fix edge repository configuration DEB packages. ([#11458](https://github.com/netdata/netdata/pull/11458), [@Ferroin](https://github.com/Ferroin)) - Add check for failed protobuf configure or make ([#11450](https://github.com/netdata/netdata/pull/11450), [@MrZammler](https://github.com/MrZammler)) - Don’t bail early if we fail to build cloud deps with required cloud. ([#11446](https://github.com/netdata/netdata/pull/11446), [@Ferroin](https://github.com/Ferroin)) - Change default to not using LTO for builds. ([#11432](https://github.com/netdata/netdata/pull/11432), [@Ferroin](https://github.com/Ferroin)) - Use DebHelper compat level 9 in repoconfig packages to support Ubuntu 16.04 ([#11426](https://github.com/netdata/netdata/pull/11426), [@Ferroin](https://github.com/Ferroin)) - Add capsh check before issuing setcap cap_perfmon ([#11386](https://github.com/netdata/netdata/pull/11386), [@oldgiova](https://github.com/oldgiova)) - Update handling of builds of bundled dependencies. ([#11375](https://github.com/netdata/netdata/pull/11375), [@Ferroin](https://github.com/Ferroin)) - Add support for bundling protobuf as part of the install. ([#11374](https://github.com/netdata/netdata/pull/11374), [@Ferroin](https://github.com/Ferroin)) - Properly handle eBPF plugin in RPM packages. 
([#11362](https://github.com/netdata/netdata/pull/11362), [@Ferroin](https://github.com/Ferroin)) - Add support for claiming existing installs via kickstart scripts. ([#11350](https://github.com/netdata/netdata/pull/11350), [@Ferroin](https://github.com/Ferroin)) - Assorted kickstart install fixes. ([#11342](https://github.com/netdata/netdata/pull/11342), [@Ferroin](https://github.com/Ferroin)) - Add aclk-schemas to dist_noinst_DATA ([#11338](https://github.com/netdata/netdata/pull/11338), [@underhood](https://github.com/underhood)) - Auto-detect PGID in Dockerfile's ENTRYPOINT script ([#11274](https://github.com/netdata/netdata/pull/11274), [@odyslam](https://github.com/odyslam)) - Add code for repository configuration packages. ([#11273](https://github.com/netdata/netdata/pull/11273), [@Ferroin](https://github.com/Ferroin)) - Explicitly update libarchive on CentOS 8 when installing dependencies. ([#11264](https://github.com/netdata/netdata/pull/11264), [@Ferroin](https://github.com/Ferroin)) - Fix kickstart-static64.sh install script failing when trying to access `.install-type` before it is created ([#11262](https://github.com/netdata/netdata/pull/11262), [@ilyam8](https://github.com/ilyam8)) - Add openSUSE 15.3 package builds. ([#11259](https://github.com/netdata/netdata/pull/11259), [@Ferroin](https://github.com/Ferroin)) - Fix libjudy installation on CentOS 8. ([#11248](https://github.com/netdata/netdata/pull/11248), [@Ferroin](https://github.com/Ferroin)) - Fix `install_type` detection during update ([#11199](https://github.com/netdata/netdata/pull/11199), [@ilyam8](https://github.com/ilyam8)) - Store info about the installation type for later retrieval. ([#11157](https://github.com/netdata/netdata/pull/11157), [@Ferroin](https://github.com/Ferroin)) - Compile/Link with absolute paths for bundled/vendored deps. 
([#11129](https://github.com/netdata/netdata/pull/11129), [@vkalintiris](https://github.com/vkalintiris)) - Fix writing updater log under root ([#10901](https://github.com/netdata/netdata/pull/10901), [@nabijaczleweli](https://github.com/nabijaczleweli)) - Add ARM binary package builds to CI. ([#10769](https://github.com/netdata/netdata/pull/10769), [@Ferroin](https://github.com/Ferroin)) ## Other Notable Changes ### Improvements - Clean compilation warnings ([#11810](https://github.com/netdata/netdata/pull/11810), [@stelfrag](https://github.com/stelfrag)) - Fix coverity issues ([#11809](https://github.com/netdata/netdata/pull/11809), [@stelfrag](https://github.com/stelfrag)) - Add commands to check and fix database corruption ([#11828](https://github.com/netdata/netdata/pull/11828), [@stelfrag](https://github.com/stelfrag)) - Use two digits after the decimal point for the anomaly rate. ([#11804](https://github.com/netdata/netdata/pull/11804), [@vkalintiris](https://github.com/vkalintiris)) - Always queue alerts to aclk_alert ([#11806](https://github.com/netdata/netdata/pull/11806), [@MrZammler](https://github.com/MrZammler)) - Add some logging for cloud new architecture to access.log ([#11788](https://github.com/netdata/netdata/pull/11788), [@MrZammler](https://github.com/MrZammler)) - Delete from aclk alerts table if ack'ed from cloud one day ago ([#11779](https://github.com/netdata/netdata/pull/11779), [@MrZammler](https://github.com/MrZammler)) - Remove feature flag for ACLK new cloud architecture ([#11774](https://github.com/netdata/netdata/pull/11774), [@stelfrag](https://github.com/stelfrag)) - Insert alert into aclk_alert directly instead of queuing it ([#11769](https://github.com/netdata/netdata/pull/11769), [@MrZammler](https://github.com/MrZammler)) - Store and submit dimension delete messages for new cloud architecture ([#11765](https://github.com/netdata/netdata/pull/11765), [@stelfrag](https://github.com/stelfrag)) - Implement cloud initiated 
disconnect command ([#11723](https://github.com/netdata/netdata/pull/11723), [@underhood](https://github.com/underhood)) - Announce proto capability and enable if cloud supports ([#11476](https://github.com/netdata/netdata/pull/11476), [@underhood](https://github.com/underhood)) - Add exit points between env and OTP ([#11751](https://github.com/netdata/netdata/pull/11751), [@underhood](https://github.com/underhood)) - Improve the ACLK sync process for the new cloud architecture ([#11744](https://github.com/netdata/netdata/pull/11744), [@stelfrag](https://github.com/stelfrag)) - Disable C++ warnings from dlib library. ([#11738](https://github.com/netdata/netdata/pull/11738), [@vkalintiris](https://github.com/vkalintiris)) - Add queue removed alerts to cloud for new architecture ([#11704](https://github.com/netdata/netdata/pull/11704), [@MrZammler](https://github.com/MrZammler)) - Add support to stream chart labels on a parent - child setup ([#11675](https://github.com/netdata/netdata/pull/11675), [@MrZammler](https://github.com/MrZammler)) - Add snapshot message for cloud new architecture ([#11664](https://github.com/netdata/netdata/pull/11664), [@MrZammler](https://github.com/MrZammler)) - Add protobuf to `-W buildinfo` output. 
([#11634](https://github.com/netdata/netdata/pull/11634), [@Ferroin](https://github.com/Ferroin)) - Add new alarm status protocol messages ([#11612](https://github.com/netdata/netdata/pull/11612), [@underhood](https://github.com/underhood)) - Add local webserver API/v1 call "aclk" ([#11588](https://github.com/netdata/netdata/pull/11588), [@underhood](https://github.com/underhood)) - Make New Cloud architecture optional for ACLK-NG ([#11587](https://github.com/netdata/netdata/pull/11587), [@underhood](https://github.com/underhood)) - Enable additional functionality for the new cloud architecture ([#11579](https://github.com/netdata/netdata/pull/11579), [@stelfrag](https://github.com/stelfrag)) - Add alert message support for ACLK new architecture ([#11552](https://github.com/netdata/netdata/pull/11552), [@MrZammler](https://github.com/MrZammler)) - Add support for Anomaly Detection MVP ([#11548](https://github.com/netdata/netdata/pull/11548), [@vkalintiris](https://github.com/vkalintiris)) - Add New Cloud Protocol files to CMake ([#11536](https://github.com/netdata/netdata/pull/11536), [@underhood](https://github.com/underhood)) - Add archive uploads for dist, package build, and static build checks. 
([#11534](https://github.com/netdata/netdata/pull/11534), [@Ferroin](https://github.com/Ferroin)) - Add node message support for ACLK new architecture ([#11514](https://github.com/netdata/netdata/pull/11514), [@stelfrag](https://github.com/stelfrag)) - Clean netdata naming ([#11484](https://github.com/netdata/netdata/pull/11484), [@andrewm4894](https://github.com/andrewm4894)) - Add aclk/cloud state command to netdatacli ([#11462](https://github.com/netdata/netdata/pull/11462), [@underhood](https://github.com/underhood)) - Add chart message support for ACLK new architecture ([#11447](https://github.com/netdata/netdata/pull/11447), [@stelfrag](https://github.com/stelfrag)) - Add Alert Related API for new protocol ([#11424](https://github.com/netdata/netdata/pull/11424), [@underhood](https://github.com/underhood)) - Update SQLite version from v3.33.0 to 3.36.0 ([#11423](https://github.com/netdata/netdata/pull/11423), [@stelfrag](https://github.com/stelfrag)) - Add SQLite unit tests ([#11422](https://github.com/netdata/netdata/pull/11422), [@stelfrag](https://github.com/stelfrag)) - Add NodeInstanceInfo API ([#11419](https://github.com/netdata/netdata/pull/11419), [@underhood](https://github.com/underhood)) - Use SQLite to store the health log and alert configurations. 
([#11399](https://github.com/netdata/netdata/pull/11399), [@MrZammler](https://github.com/MrZammler)) - Add ACLK synchronization event loop ([#11396](https://github.com/netdata/netdata/pull/11396), [@stelfrag](https://github.com/stelfrag)) - Add HTTP basic authentication to Prometheus remote write and HTTP versions of Graphite, JSON, OpenTSDB ([#11394](https://github.com/netdata/netdata/pull/11394), [@vlvkobal](https://github.com/vlvkobal)) - Add new Cloud chart related parsers and generators ([#11393](https://github.com/netdata/netdata/pull/11393), [@underhood](https://github.com/underhood)) - Remove warning when GCC 8.x is used ([#11389](https://github.com/netdata/netdata/pull/11389), [@thiagoftsm](https://github.com/thiagoftsm)) - Add support to allow ACLK-NG to grow MQTT buffer ([#11340](https://github.com/netdata/netdata/pull/11340), [@underhood](https://github.com/underhood)) - Add support for bundled protobuf ([#11335](https://github.com/netdata/netdata/pull/11335), [@underhood](https://github.com/underhood)) - Add ACLK-NG cloud request type charts ([#11326](https://github.com/netdata/netdata/pull/11326), [@UmanShahzad](https://github.com/UmanShahzad)) - Add HTTP access log messages for ACLK-NG ([#11318](https://github.com/netdata/netdata/pull/11318), [@UmanShahzad](https://github.com/UmanShahzad)) - Add a log message when the page cache manager sleeps for more than 1 second. 
([#11314](https://github.com/netdata/netdata/pull/11314), [@vkalintiris](https://github.com/vkalintiris)) - Add hop count for children ([#11311](https://github.com/netdata/netdata/pull/11311), [@stelfrag](https://github.com/stelfrag)) - Remove access check for install-type file ([#11288](https://github.com/netdata/netdata/pull/11288), [@MrZammler](https://github.com/MrZammler)) - Support TLS SNI in ACLK-NG ([#11285](https://github.com/netdata/netdata/pull/11285), [@underhood](https://github.com/underhood)) - Make ACLK-NG the default if available ([#11272](https://github.com/netdata/netdata/pull/11272), [@underhood](https://github.com/underhood)) - Add extra posthog attributes ([#11237](https://github.com/netdata/netdata/pull/11237), [@MrZammler](https://github.com/MrZammler)) - Add support to ACLK-NG for new Cloud NodeInstance related msgs ([#11234](https://github.com/netdata/netdata/pull/11234), [@underhood](https://github.com/underhood)) - Add support so ACLK NG and Legacy can coexist ([#11225](https://github.com/netdata/netdata/pull/11225), [@underhood](https://github.com/underhood)) - Move cleanup of obsolete charts to a separate thread ([#11222](https://github.com/netdata/netdata/pull/11222), [@vlvkobal](https://github.com/vlvkobal)) - Add check to only report the exit code when anonymous statistics script fails ([#11215](https://github.com/netdata/netdata/pull/11215), [@MrZammler](https://github.com/MrZammler)) - Reduce memory needed per dimension ([#11212](https://github.com/netdata/netdata/pull/11212), [@stelfrag](https://github.com/stelfrag)) - Improve dbengine initialization to ignore journal files that cannot be read ([#11210](https://github.com/netdata/netdata/pull/11210), [@stelfrag](https://github.com/stelfrag)) - Use memory mode RAM if memory mode dbengine is specified but not available ([#11207](https://github.com/netdata/netdata/pull/11207), [@stelfrag](https://github.com/stelfrag)) - Improve return status check for the execution of anonymous 
statistics script ([#11188](https://github.com/netdata/netdata/pull/11188), [@MrZammler](https://github.com/MrZammler)) - Reuse the SN_EXISTS bit to track anomaly status. ([#11154](https://github.com/netdata/netdata/pull/11154), [@vkalintiris](https://github.com/vkalintiris)) - Remove deprecated command line options ([#11149](https://github.com/netdata/netdata/pull/11149), [@vkalintiris](https://github.com/vkalintiris)) - Remove unnecessary relative paths when including headers. ([#11124](https://github.com/netdata/netdata/pull/11124), [@vkalintiris](https://github.com/vkalintiris)) - Add field to provide UTC offset in seconds and edit health config command ([#11051](https://github.com/netdata/netdata/pull/11051), [@MrZammler](https://github.com/MrZammler)) ### Bug fixes - Set NETDATA_CONTAINER_OS_DETECTION properly ([#11827](https://github.com/netdata/netdata/pull/11827), [@MrZammler](https://github.com/MrZammler)) - Fix agent crash when ACLK sync thread is not initialized ([#11820](https://github.com/netdata/netdata/pull/11820), [@MrZammler](https://github.com/MrZammler)) - Simple fix for the data API query ([#11787](https://github.com/netdata/netdata/pull/11787), [@vlvkobal](https://github.com/vlvkobal)) - Use the proper format specifier when logging configuration options. 
([#11795](https://github.com/netdata/netdata/pull/11795), [@vkalintiris](https://github.com/vkalintiris)) - Use correct hop count if host is already in memory ([#11785](https://github.com/netdata/netdata/pull/11785), [@stelfrag](https://github.com/stelfrag)) - Fix proc/interrupts parser ([#11783](https://github.com/netdata/netdata/pull/11783), [@maximethebault](https://github.com/maximethebault)) - Skip sending hidden dimensions via ACLK ([#11770](https://github.com/netdata/netdata/pull/11770), [@stelfrag](https://github.com/stelfrag)) - Fix host hop count reported to the cloud ([#11768](https://github.com/netdata/netdata/pull/11768), [@stelfrag](https://github.com/stelfrag)) - Fix log if D_ACLK is used ([#11763](https://github.com/netdata/netdata/pull/11763), [@underhood](https://github.com/underhood)) - Fix retention message duration when no local metrics are found ([#11762](https://github.com/netdata/netdata/pull/11762), [@stelfrag](https://github.com/stelfrag)) - Fix an issue with incomplete payload served when https is enabled ([#11754](https://github.com/netdata/netdata/pull/11754), [@MrZammler](https://github.com/MrZammler)) - Fix a typo in the popcorn information message ([#11745](https://github.com/netdata/netdata/pull/11745), [@underhood](https://github.com/underhood)) - Fix /api/v1/info if ml-info is missing ([#11739](https://github.com/netdata/netdata/pull/11739), [@MrZammler](https://github.com/MrZammler)) - Fix typo in aclk_query.c ([#11737](https://github.com/netdata/netdata/pull/11737), [@eltociear](https://github.com/eltociear)) - Fix online chart in NG not updated properly ([#11734](https://github.com/netdata/netdata/pull/11734), [@underhood](https://github.com/underhood)) - Fix coverity CID #373610 ([#11719](https://github.com/netdata/netdata/pull/11719), [@MrZammler](https://github.com/MrZammler)) - Fix loading old and custom dashboards ([#11710](https://github.com/netdata/netdata/pull/11710), [@rupokify](https://github.com/rupokify)) - Fix 
coverity issues 373612 & 373611 ([#11684](https://github.com/netdata/netdata/pull/11684), [@MrZammler](https://github.com/MrZammler)) - Fix warnings from -Wformat-truncation=2 ([#11676](https://github.com/netdata/netdata/pull/11676), [@MrZammler](https://github.com/MrZammler)) - Fix interval usage and reduce I/O ([#11662](https://github.com/netdata/netdata/pull/11662), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix build issue related to legacy aclk and new arch code ([#11655](https://github.com/netdata/netdata/pull/11655), [@MrZammler](https://github.com/MrZammler)) - Fix typo in URL when calling env ([#11651](https://github.com/netdata/netdata/pull/11651), [@underhood](https://github.com/underhood)) - Fix false poll timeout ([#11650](https://github.com/netdata/netdata/pull/11650), [@underhood](https://github.com/underhood)) - Fix chart config overflow ([#11645](https://github.com/netdata/netdata/pull/11645), [@stelfrag](https://github.com/stelfrag)) - Fix an overflow when unsigned integer subtracted ([#11638](https://github.com/netdata/netdata/pull/11638), [@vlvkobal](https://github.com/vlvkobal)) - Fix coverity issues 373400-373402 ([#11631](https://github.com/netdata/netdata/pull/11631), [@stelfrag](https://github.com/stelfrag)) - Fix proper initialization struct with zeroes ([#11621](https://github.com/netdata/netdata/pull/11621), [@MrZammler](https://github.com/MrZammler)) - Fix https client ([#11608](https://github.com/netdata/netdata/pull/11608), [@underhood](https://github.com/underhood)) - Fix CID 339027 and reverse arguments ([#11578](https://github.com/netdata/netdata/pull/11578), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix resource leak when analytics thread stops ([#11575](https://github.com/netdata/netdata/pull/11575), [@MrZammler](https://github.com/MrZammler)) - Fix coverity report issues CID_373247-373251 ([#11549](https://github.com/netdata/netdata/pull/11549), [@stelfrag](https://github.com/stelfrag)) - Fix coverity issues for 
health config ([#11535](https://github.com/netdata/netdata/pull/11535), [@MrZammler](https://github.com/MrZammler)) - Fix issue with log messages appearing in the terminal instead of the error.log on startup ([#11524](https://github.com/netdata/netdata/pull/11524), [@stelfrag](https://github.com/stelfrag)) - Fix issues in Alarm API ([#11491](https://github.com/netdata/netdata/pull/11491), [@underhood](https://github.com/underhood)) - Fix list corruption in ACLK sync code and remove fatal ([#11444](https://github.com/netdata/netdata/pull/11444), [@stelfrag](https://github.com/stelfrag)) - Fix coverity reported issues 372243 - 372248 ([#11429](https://github.com/netdata/netdata/pull/11429), [@stelfrag](https://github.com/stelfrag)) - Fix CID 372233 to CID 372236 ([#11411](https://github.com/netdata/netdata/pull/11411), [@underhood](https://github.com/underhood)) - Fix bundled protobuf linkage on systems needing -latomic ([#11406](https://github.com/netdata/netdata/pull/11406), [@underhood](https://github.com/underhood)) - Fix coverity issue 372222 ([#11404](https://github.com/netdata/netdata/pull/11404), [@stelfrag](https://github.com/stelfrag)) - Fix typo in analytics.c ([#11329](https://github.com/netdata/netdata/pull/11329), [@eltociear](https://github.com/eltociear)) - Fix coverity errors in ACLK ([#11322](https://github.com/netdata/netdata/pull/11322), [@underhood](https://github.com/underhood)) - Fix confusing error in ACLK Legacy ([#11278](https://github.com/netdata/netdata/pull/11278), [@underhood](https://github.com/underhood)) - Fix an issue to send correct aclk implementation used by agent to posthog. 
([#11247](https://github.com/netdata/netdata/pull/11247), [@MrZammler](https://github.com/MrZammler)) - Fix error on --disable-cloud ([#11244](https://github.com/netdata/netdata/pull/11244), [@underhood](https://github.com/underhood)) - Fix mqtt_websockets submodule version ([#11196](https://github.com/netdata/netdata/pull/11196), [@underhood](https://github.com/underhood)) - Fix claiming script exit code when daemon not running and the claim was successful ([#11195](https://github.com/netdata/netdata/pull/11195), [@ilyam8](https://github.com/ilyam8)) - Fix loading of class, component and type from health log when sufficient fields are detected. ([#11193](https://github.com/netdata/netdata/pull/11193), [@MrZammler](https://github.com/MrZammler)) - Fix issue with mqtt_websockets on FreeBSD ([#11172](https://github.com/netdata/netdata/pull/11172), [@underhood](https://github.com/underhood)) - Fix typo in aclk.c ([#11170](https://github.com/netdata/netdata/pull/11170), [@eltociear](https://github.com/eltociear)) - Fix mqtt_websockets on MacOS ([#11145](https://github.com/netdata/netdata/pull/11145), [@underhood](https://github.com/underhood)) ## Deprecation notice An upcoming stable release of the Netdata agent will include a maintainability update to our base Docker image. A small percentage of users will find that all self-compiled packages must be manually rebuilt after the update, even if relocation/SONAME errors are not encountered. `--security-opt=seccomp=unconfined` can be passed with no default.json, but this introduces security vulnerabilities between the host and malicious code in the container. 
Alternatively, users can prepare for the update by upgrading to one of the following: - runc v1.0.0-rc93 - Docker 19.03.9 or greater AND libseccomp 2.4.2 or greater While Netdata previously avoided making this update to minimize inconvenience to our users, we are now facing a third-party end-of-life date, and we believe the minimal number of affected users substantiates the need for the change. Additionally, in a future stable release, we will be removing our legacy agent-to-cloud connection. Most users should see no change in this upgrade, but we will lose SOCKS 5 proxy support for the Netdata Cloud functionality, which will affect a small number of users. ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata agent, feel free to contact us by one of the following channels: - [Github](https://github.com/netdata): You can use our Github repo to report bugs and submit feature requests. - [Community forum](https://community.netdata.cloud): You can visit our community forum for questions and training. - **NEW**: [Discord](https://discord.gg/2eduZdSeC7): You can jump into our Discord for interactive, synchronous help and discussion. More than 700 engineers are already using it! Join us! 2021-11-30T19:50:51+00:00 netdata 1.32.1 netdata 1.32.1 2021-12-14T15:32:48+00:00 # Netdata v1.32.1 Netdata v1.32.1 is a patch release to address issues discovered since v1.32.0. This release contains bug fixes and documentation updates, including clarified instructions for ACLK and our Machine Learning (ML) functionality. We appreciate our community's help in identifying and diagnosing these issues so we could fix them quickly. We encourage users to upgrade to the latest version at their earliest convenience.
## Acknowledgments - [@boxjan](https://github.com/boxjan) For providing a fix to correctly pass arguments in static builds. ## Documentation - Clean up anomaly-detection guide docs ([#11901](https://github.com/netdata/netdata/pull/11901), [@andrewm4894](https://github.com/andrewm4894)) - Add Swagger docs for new `/api/v1/aclk` endpoint ([#11881](https://github.com/netdata/netdata/pull/11881), [@underhood](https://github.com/underhood)) - Minor ACLK documentation updates ([#11882](https://github.com/netdata/netdata/pull/11882), [@underhood](https://github.com/underhood)) - Add z score alarm example ([#11871](https://github.com/netdata/netdata/pull/11871), [@andrewm4894](https://github.com/andrewm4894)) - Create ML README.md ([#11848](https://github.com/netdata/netdata/pull/11848), [@andrewm4894](https://github.com/andrewm4894)) - Update nightly badge link ([#11843](https://github.com/netdata/netdata/pull/11843), [@ilyam8](https://github.com/ilyam8)) ## Packaging / Installation - Fix postdrop handling for systemd systems. ([#11885](https://github.com/netdata/netdata/pull/11885), [@Ferroin](https://github.com/Ferroin)) - Don't produce output when static update succeeded ([#11879](https://github.com/netdata/netdata/pull/11879), [@ilyam8](https://github.com/ilyam8)) - Make netdata-updater.sh POSIX compliant. ([#11755](https://github.com/netdata/netdata/pull/11755), [@Ferroin](https://github.com/Ferroin)) - Fix exit code when updating static install && updater script ([#11873](https://github.com/netdata/netdata/pull/11873), [@ilyam8](https://github.com/ilyam8)) - Fix passing of extra arguements in static builds ([#11852](https://github.com/netdata/netdata/pull/11852), [@boxjan](https://github.com/boxjan)) - Explicitly conflict with distro netdata DEB packages. ([#11855](https://github.com/netdata/netdata/pull/11855), [@Ferroin](https://github.com/Ferroin)) - Bump static builds to use Alpine 3.15 as a base. 
([#11836](https://github.com/netdata/netdata/pull/11836), [@Ferroin](https://github.com/Ferroin)) - Detect whether libatomic should be linked in when using CXX linker. ([#11818](https://github.com/netdata/netdata/pull/11818), [@vkalintiris](https://github.com/vkalintiris)) - Fix token name in release draft workflow. ([#11847](https://github.com/netdata/netdata/pull/11847), [@Ferroin](https://github.com/Ferroin)) - Remove OpenSUSE Leap 15.2 from CI. ([#11600](https://github.com/netdata/netdata/pull/11600), [@Ferroin](https://github.com/Ferroin)) - Remove Fedora 33 from CI. ([#11640](https://github.com/netdata/netdata/pull/11640), [@Ferroin](https://github.com/Ferroin)) ## Bug Fixes - Use the chart id instead of chart name in response to incoming cloud context queries ([#11898](https://github.com/netdata/netdata/pull/11898), [@stelfrag](https://github.com/stelfrag)) - Fix used_swap alarm calculation ([#11868](https://github.com/netdata/netdata/pull/11868), [@ilyam8](https://github.com/ilyam8)) - Initialize enabled parameter to 1 in AlarmLogHealth message ([#11856](https://github.com/netdata/netdata/pull/11856), [@MrZammler](https://github.com/MrZammler)) 2021-12-14T15:32:48+00:00 netdata v1.33.0 netdata v1.33.0 2022-01-26T16:04:55+00:00 # Release v1.33.0 Happy New Year to everyone in the Netdata community. After one of our biggest releases ever, we have re-energized over the holidays and are ready to continue helping more people troubleshoot their infrastructure. Hopefully you've already heard about the improvements we made to the kickstart script. With this release, we're adding even more features: - [Netdata is now distributed as pre-built packages on many Linux distributions](#installation) - [Stream compression (tech preview)](#streaming-compression) - [eBPF CO-RE support](#eBPF-CO-RE) > ❗We're also keeping our codebase healthy by removing end-of-life features. Read the [deprecation notice](#deprecation-notice) to check if you are affected. 
If you love Netdata and haven't yet given us a [GitHub star](https://github.com/netdata/netdata), please do; we would really appreciate it! ### Netdata open-source Agent growth The open-source Netdata Agent, the best OSS node monitoring and troubleshooting solution, currently has: - 1,300,000 unique Netdata nodes live! - An amazing adoption rate, with 3,300 new nodes per day! - 280,000 Docker pulls per day with 375 million total, according to DockerHub! ### Community news Netdata is supported both by an active community of global contributors and the Netdata staff. Get involved: - [Build with us on Netdata](https://github.com/netdata/netdata). - [Join us on Discord](https://discord.gg/mP7VD76Y), and say hello. - Contribute your knowledge with [open GitHub issues](https://github.com/netdata/netdata/issues). - Launching soon: be on the lookout for our new monthly community awards. ## Release highlights <a id="installation"></a> ### Netdata is now distributed as pre-built packages on many Linux distributions We recently released a completely new version of our one-line installer code. Wherever available, our new kickstart script uses DEB or RPM packages provided by Netdata. These packages are tightly integrated with the package management system of the distribution, providing the best installation experience in a reliable and fast way. Already over 70% of our new installations use DEB or RPM packages! The updated kickstart script has several advantages over the old one: - It’s more advanced because it automatically selects the best supported installation method for your system. However, you can still explicitly ask for a specific type of installation method. - It’s more convenient as it requires no manual installation of packages on a majority of systems. - It’s more resource efficient on most systems, meaning less impact on your running workloads (and much faster installs on idle systems).
📄 Find the updated install documentation [on our official docs site](https://learn.netdata.cloud/docs/agent/packaging/installer/methods/kickstart). If you were using the old `kickstart.sh` script through a custom script or orchestration tool, you may need to update the options being passed to get it to behave like it used to (this will usually just involve adding `--build-only` to the options). Other installation types do not need to make any changes because of this. ### Stream compression (tech preview) <a id="streaming-compression"></a> The Agent's [streaming](https://learn.netdata.cloud/docs/agent/streaming) mechanism now supports stream compression. Streaming thousands of metrics between Netdata Agents increases your data availability and provides a more robust mechanism to monitor your metrics and troubleshoot problems. **Stream compression allows you to**: - Save up to 70% of bandwidth by reducing the size of transmitted metrics between Netdata Agents. - Therefore, reduce costs over metered data connections by up to 70%. - Take advantage of low-speed connections. Stream compression uses the lossless ["LZ4 - Extreme fast compression"](https://github.com/lz4/lz4) library. It achieves compression speeds up to 800Mbps, decompression speeds up to 4500Mbps with an average compression ratio between 2.0 and 3.0. Because this is a technical preview and we are still working to make it amazing, stream compression will be **disabled by default**. 📄 Learn how to [enable streaming between nodes](https://learn.netdata.cloud/docs/metrics-storage-management/enable-streaming). 📄 If you already stream between nodes, learn how to [enable streaming compression](https://learn.netdata.cloud/docs/agent/streaming#streaming-compression) > Note: Stream compression only works if all participating Netdata Agents are hosted on an OS which supports the library version [lz4 v1.9.0+](https://github.com/lz4/lz4). 
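As a rough check on the bandwidth figures quoted for stream compression above, the relationship between a compression ratio and the fraction of bandwidth saved can be sketched in a few lines. This is only illustrative arithmetic, not Netdata code: the function names `bandwidth_saved` and `ratio_for_saving` are hypothetical, and the sketch assumes the saving applies to the metric payload alone and ignores any protocol overhead.

```python
def bandwidth_saved(compression_ratio: float) -> float:
    """Fraction of bandwidth saved when payloads shrink by `compression_ratio`.

    Hypothetical helper for illustration; a ratio of 2.0 means the
    compressed stream is half the size of the uncompressed one.
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / compression_ratio


def ratio_for_saving(saving: float) -> float:
    """Compression ratio needed to save the given fraction of bandwidth."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return 1.0 / (1.0 - saving)


if __name__ == "__main__":
    # Average ratios quoted in the release notes: between 2.0 and 3.0.
    for ratio in (2.0, 3.0):
        print(f"ratio {ratio:.1f} -> {bandwidth_saved(ratio):.1%} saved")
    # Ratio that would correspond to the "up to 70%" figure.
    print(f"70% saved needs ratio {ratio_for_saving(0.70):.2f}")
```

Under these assumptions, the average ratios of 2.0 to 3.0 correspond to savings of roughly 50% to 67%, while a 70% saving corresponds to a ratio of about 3.3, so the "up to" figure describes better-than-average cases.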
If a Netdata Agent does not detect the [lz4 v1.9.0+](https://github.com/lz4/lz4) library version, it will disable stream compression. ### eBPF CO-RE support <a id="eBPF-CO-RE"></a> In v1.32 we added some major improvements to our eBPF support. For this release, we’re taking the next step by gradually introducing BPF CO-RE support! Today, the distribution of eBPF programs is very challenging, because compiling an eBPF program against so many different Linux kernels is complex. We want to make eBPF widely available to everyone without worrying about compatibility. And here is where eBPF CO-RE (Compile Once, Run Everywhere), part of libbpf, comes to the rescue. CO-RE is a modern approach to writing portable BPF applications that can run on multiple kernel versions and configurations without modification or runtime source code compilation on the target machine. We now have the opportunity to focus on what matters, add more features, and improve performance of our eBPF offering! Furthermore, in this release we also introduce two new eBPF charts: * Threads info: Displays the total number of active eBPF threads and the number of all eBPF threads. * Load info: Measures the number of eBPF threads running on legacy code or CO-RE. ![Screenshot_20220125_213415](https://user-images.githubusercontent.com/13576110/151068688-d9fbeb1d-6759-4394-a371-15daa01ff248.png) ## Acknowledgments We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer is essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - [@NikolayS](https://github.com/NikolayS) for various improvements of python.d/postgres collector. - [@Saruspete](https://github.com/Saruspete) for fixing handling of port_rcv_data and port_xmit_data counters in proc/infiniband collector. - [@ardabbour](https://github.com/ardabbour) for fixing errors in the exporting walkthrough.
- [@avstrakhov](https://github.com/avstrakhov) for adding LZ4 streaming data compression. - [@boxjan](https://github.com/boxjan) for fixing permissions of plugins for static builds. - [@candrews](https://github.com/candrews) for adding a note that Netdata is available on Gentoo. - [@cmd-ntrf](https://github.com/cmd-ntrf) for fixing claim node examples in kickstart(-64) documentation. - [@jsoref](https://github.com/jsoref) for fixing spelling. - [@laned130](https://github.com/laned130) for adding a missing expression operator to the health configuration reference. - [@lokerhp](https://github.com/lokerhp) for fixing a typo in the dashboard_info.js. - [@neotf](https://github.com/neotf) for adding memory usage chart to python.d/spigotmc collector. - [@pbouchez](https://github.com/pbouchez) for adding bar1 memory usage chart to python.d/nvidia_smi collector. - [@scatenag](https://github.com/scatenag) for fixing collecting user statistics for LDAP users in python.d/nvidia_smi collector. - [@sourcecodes2](https://github.com/sourcecodes2) for adding channels support to PushBullet notification method. - [@bompus](https://github.com/bompus) for fixing collecting replica set stats in go.d/mongodb collector. ## Collectors ### Improvements - Prefer python3 if available (python.d) ([#12001](https://github.com/netdata/netdata/pull/12001), [@ilyam8](https://github.com/ilyam8)) - Add bar1 memory usage chart (python.d/nvidia_smi) ([#11956](https://github.com/netdata/netdata/pull/11956), [@pbouchez](https://github.com/pbouchez)) - Add a note that Netfilter's "new" and "ignore" counters are removed in the latest kernel ([#11950](https://github.com/netdata/netdata/pull/11950), [@ilyam8](https://github.com/ilyam8)) - Consider mat. 
views as tables in table size/count chart (python.d/postgres) ([#11816](https://github.com/netdata/netdata/pull/11816), [@NikolayS](https://github.com/NikolayS)) - Use block_size instead of 8*1024 (python.d/postgres) ([#11815](https://github.com/netdata/netdata/pull/11815), [@NikolayS](https://github.com/NikolayS)) ### Bug fixes - Fix handling of port_rcv_data and port_xmit_data counters (proc/infiniband)([#11994](https://github.com/netdata/netdata/pull/11994), [@Saruspete](https://github.com/Saruspete)) - Fix handling of decoding errors in ExecutableService (python.d) ([#11979](https://github.com/netdata/netdata/pull/11979), [@ilyam8](https://github.com/ilyam8)) - Fix lack of sufficient system capabilities (perf.plugin) ([#11958](https://github.com/netdata/netdata/pull/11958), [@vlvkobal](https://github.com/vlvkobal)) - Fix Netfilter accounting charts priority (nfacct.plugin) ([#11952](https://github.com/netdata/netdata/pull/11952), [@ilyam8](https://github.com/ilyam8)) - Fix lack of sufficient system capabilities (nfacct.plugin) ([#11951](https://github.com/netdata/netdata/pull/11951), [@ilyam8](https://github.com/ilyam8)) - Fix collecting user statistics for LDAP users (python.d/nvidia_smi) ([#11858](https://github.com/netdata/netdata/pull/11858), [@scatenag](https://github.com/scatenag)) - Fix tps decode, and add memory usage chart (python.d/spigotmc) ([#11797](https://github.com/netdata/netdata/pull/11797), [@neotf](https://github.com/neotf)) - Fix collecting replica set stats (go.d/mongodb) ([#639](https://github.com/netdata/go.d.plugin/pull/639), [@bompus](https://github.com/bompus)) ## eBPF ### Improvements - Add ebpf.plugin informational charts and various optimizations ([#11992](https://github.com/netdata/netdata/pull/11992), [@thiagoftsm](https://github.com/thiagoftsm)) - Update libbpf library to v0.6.1 ([#11865](https://github.com/netdata/netdata/pull/11865), [@thiagoftsm](https://github.com/thiagoftsm)) ### Bug fixes - Fix disabling specific ebpf 
collectors ([#12014](https://github.com/netdata/netdata/pull/12014), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix cachestat on kernel 5.15.x ([#11833](https://github.com/netdata/netdata/pull/11833), [@thiagoftsm](https://github.com/thiagoftsm)) ## Health - Add sending notifications to channels support to PushBullet ([#11850](https://github.com/netdata/netdata/pull/11850), [@sourcecodes2](https://github.com/sourcecodes2)) ## Streaming - Add LZ4 streaming data compression ([#11821](https://github.com/netdata/netdata/pull/11821), [@avstrakhov](https://github.com/avstrakhov)) ## Documentation - Fix formatting in the streaming doc ([#12026](https://github.com/netdata/netdata/pull/12026), [@kickoke](https://github.com/kickoke)) - Fix a typo in the python.d/mongodb readme ([#12024](https://github.com/netdata/netdata/pull/12024), [@cboydstun](https://github.com/cboydstun)) - Add a note that streaming compression is disabled by default ([#12019](https://github.com/netdata/netdata/pull/12019), [@odynik](https://github.com/odynik)) - Refine the idlejitter.plugin docs ([#12012](https://github.com/netdata/netdata/pull/12012), [@kickoke](https://github.com/kickoke)) - Delete unused collectors quickstart guide ([#12000](https://github.com/netdata/netdata/pull/12000), [@kickoke](https://github.com/kickoke)) - Delete duplicate getting started doc ([#11978](https://github.com/netdata/netdata/pull/11978), [@kickoke](https://github.com/kickoke)) - Refine the python example for clarity ([#11989](https://github.com/netdata/netdata/pull/11989), [@kickoke](https://github.com/kickoke)) - Add alternative install command for macOS ([#11997](https://github.com/netdata/netdata/pull/11997), [@Ferroin](https://github.com/Ferroin)) - Refine the bash example for clarity ([#11990](https://github.com/netdata/netdata/pull/11990), [@kickoke](https://github.com/kickoke)) - Fix the statsd.plugin readme formatting ([#11943](https://github.com/netdata/netdata/pull/11943), 
[@kickoke](https://github.com/kickoke)) - Update SNMPv3 documentation ([#11959](https://github.com/netdata/netdata/pull/11959), [@kickoke](https://github.com/kickoke)) - Improve PagerDuty notification doc ([#11147](https://github.com/netdata/netdata/pull/11147), [@joelhans](https://github.com/joelhans)) - Fix spelling ([#10976](https://github.com/netdata/netdata/pull/10976), [@jsoref](https://github.com/jsoref)) - Fix tables side borders ([#11923](https://github.com/netdata/netdata/pull/11923), [@ilyam8](https://github.com/ilyam8)) - Fix claim node examples in kickstart(-64) documentation ([#11242](https://github.com/netdata/netdata/pull/11242), [@cmd-ntrf](https://github.com/cmd-ntrf)) - Fix the title of exporting reference doc ([#11252](https://github.com/netdata/netdata/pull/11252), [@joelhans](https://github.com/joelhans)) - Add a note with a link to guide for using on Pi ([#11605](https://github.com/netdata/netdata/pull/11605), [@andrewm4894](https://github.com/andrewm4894)) - Add "==" to the list of health expression operators ([#11905](https://github.com/netdata/netdata/pull/11905), [@laned130](https://github.com/laned130)) - Fix errors in exporting walkthrough ([#11902](https://github.com/netdata/netdata/pull/11902), [@ardabbour](https://github.com/ardabbour)) - Fix unresolved file references ([#11903](https://github.com/netdata/netdata/pull/11903), [@ilyam8](https://github.com/ilyam8)) - Add missing dependencies for New Cloud Architecture ([#11373](https://github.com/netdata/netdata/pull/11373), [@underhood](https://github.com/underhood)) - Indicate availability on Gentoo ([#7675](https://github.com/netdata/netdata/pull/7675), [@candrews](https://github.com/candrews)) ## Packaging / Installation - Fix cleanup from a failed DEB install ([#12006](https://github.com/netdata/netdata/pull/12006), [@Ferroin](https://github.com/Ferroin)) - Update go.d.plugin version to v0.31.2 ([#12005](https://github.com/netdata/netdata/pull/12005), 
[@ilyam8](https://github.com/ilyam8)) - Fix handling of static archive selection for installs. ([#12004](https://github.com/netdata/netdata/pull/12004), [@Ferroin](https://github.com/Ferroin)) - Fix install prefix handling for claiming code in new kickstart script ([#11999](https://github.com/netdata/netdata/pull/11999), [@Ferroin](https://github.com/Ferroin)) - Fix the updater script checksum validation for static builds ([#11986](https://github.com/netdata/netdata/pull/11986), [@ilyam8](https://github.com/ilyam8)) - Fix retrieving service commands without failure ([#11947](https://github.com/netdata/netdata/pull/11947), [@maneamarius](https://github.com/maneamarius)) - Fix getting the latest tag in the updater script ([#11908](https://github.com/netdata/netdata/pull/11908), [@maneamarius](https://github.com/maneamarius)) - Fix handling of agent restart on update. ([#11887](https://github.com/netdata/netdata/pull/11887), [@Ferroin](https://github.com/Ferroin)) - Fix permissions of plugins that may be built ([#11877](https://github.com/netdata/netdata/pull/11877), [@boxjan](https://github.com/boxjan)) - Fix the code that checks for available updates. ([#11870](https://github.com/netdata/netdata/pull/11870), [@Ferroin](https://github.com/Ferroin)) - Initial release of new kickstart script ([#11764](https://github.com/netdata/netdata/pull/11764), [@Ferroin](https://github.com/Ferroin)) - Add support to updater for updating native DEB/RPM installs with our official packages ([#11753](https://github.com/netdata/netdata/pull/11753), [@Ferroin](https://github.com/Ferroin)) ## Other notable changes ### Improvements - Add install type info to `-W buildinfo` output. 
([#12010](https://github.com/netdata/netdata/pull/12010), [@Ferroin](https://github.com/Ferroin)) - Add support for NVME disks with blkext driver ([#12007](https://github.com/netdata/netdata/pull/12007), [@ralphm](https://github.com/ralphm)) - Perform a host metadata update on child reconnection ([#11965](https://github.com/netdata/netdata/pull/11965), [@stelfrag](https://github.com/stelfrag)) - Send ML feature information with UpdateNodeInfo ([#11913](https://github.com/netdata/netdata/pull/11913), [@vkalintiris](https://github.com/vkalintiris)) - Use absolute features when doing training/prediction. ([#11876](https://github.com/netdata/netdata/pull/11876), [@vkalintiris](https://github.com/vkalintiris)) - Send the cloud protocol used to posthog ([#11842](https://github.com/netdata/netdata/pull/11842), [@MrZammler](https://github.com/MrZammler)) - Remove ACLK Legacy ([#11841](https://github.com/netdata/netdata/pull/11841), [@underhood](https://github.com/underhood)) ### Bug fixes - Fix access to freed memory in ACLK ([#12015](https://github.com/netdata/netdata/pull/12015), [@underhood](https://github.com/underhood)) - Fix a typo in the dashboard_info.js spigot part ([#12008](https://github.com/netdata/netdata/pull/12008), [@lokerhp](https://github.com/lokerhp)) - Fix queue removed alerts ([#11996](https://github.com/netdata/netdata/pull/11996), [@MrZammler](https://github.com/MrZammler)) - Fix coverity 374746 ([#11973](https://github.com/netdata/netdata/pull/11973), [@MrZammler](https://github.com/MrZammler)) - Fix ACLK chart description ([#11970](https://github.com/netdata/netdata/pull/11970), [@underhood](https://github.com/underhood)) - Fix a broken link in dashboard_info.js ([#11948](https://github.com/netdata/netdata/pull/11948), [@Ancairon](https://github.com/Ancairon)) - Fix an error in configure.ac ([#11937](https://github.com/netdata/netdata/pull/11937), [@underhood](https://github.com/underhood)) - Fix handling of the "-url" parameter in the claiming 
script ([#11919](https://github.com/netdata/netdata/pull/11919), [@ilyam8](https://github.com/ilyam8)) - Fix time_t format ([#11897](https://github.com/netdata/netdata/pull/11897), [@vlvkobal](https://github.com/vlvkobal)) - Fix compiling with AWS Kinesis support ([#11867](https://github.com/netdata/netdata/pull/11867), [@vlvkobal](https://github.com/vlvkobal)) - Fix cmake build ([#11862](https://github.com/netdata/netdata/pull/11862), [@vlvkobal](https://github.com/vlvkobal)) - Fix compilation warnings ([#11846](https://github.com/netdata/netdata/pull/11846), [@vlvkobal](https://github.com/vlvkobal)) ## Code organization - Remove internal dbengine header from spawn/spawn_client.c ([#12009](https://github.com/netdata/netdata/pull/12009), [@vkalintiris](https://github.com/vkalintiris)) - Better handle creation of UUID for claiming ([#11974](https://github.com/netdata/netdata/pull/11974), [@Ferroin](https://github.com/Ferroin)) - Use libnetdata/required_dummies.h in collectors. ([#11971](https://github.com/netdata/netdata/pull/11971), [@vkalintiris](https://github.com/vkalintiris)) - Do not use dbengine headers when dbengine is disabled. ([#11967](https://github.com/netdata/netdata/pull/11967), [@vkalintiris](https://github.com/vkalintiris)) - Perform a host metadata update on child reconnection ([#11965](https://github.com/netdata/netdata/pull/11965), [@stelfrag](https://github.com/stelfrag)) - Remove bitfields from rrdhost.
([#11964](https://github.com/netdata/netdata/pull/11964), [@vkalintiris](https://github.com/vkalintiris)) - Update libmongoc CMake config ([#11962](https://github.com/netdata/netdata/pull/11962), [@vlvkobal](https://github.com/vlvkobal)) - Find host and pass host->health_enabled to cloud AlarmLogHealth message ([#11960](https://github.com/netdata/netdata/pull/11960), [@MrZammler](https://github.com/MrZammler)) - Compute platform-specific list of static_threads at runtime. ([#11955](https://github.com/netdata/netdata/pull/11955), [@vkalintiris](https://github.com/vkalintiris)) - Blocking publish and in flight buffer regrowth ([#11932](https://github.com/netdata/netdata/pull/11932), [@underhood](https://github.com/underhood)) - Try to find worker config thread from inactive threads for new architecture ([#11928](https://github.com/netdata/netdata/pull/11928), [@MrZammler](https://github.com/MrZammler)) - Handle re-claim while the agent is running in new architecture ([#11924](https://github.com/netdata/netdata/pull/11924), [@MrZammler](https://github.com/MrZammler)) - Include libatomic to allow protobuf to resolve __atomic functions ([#11917](https://github.com/netdata/netdata/pull/11917), [@MrZammler](https://github.com/MrZammler)) - Provide runtime ml info from a new endpoint ([#11886](https://github.com/netdata/netdata/pull/11886), [@vkalintiris](https://github.com/vkalintiris)) - Update dependencies for the pubsub exporting connector ([#11872](https://github.com/netdata/netdata/pull/11872), [@vlvkobal](https://github.com/vlvkobal)) - Remove ACLK-NG 'cmd' switch by message type ([#11866](https://github.com/netdata/netdata/pull/11866), [@underhood](https://github.com/underhood)) - Optimize rx msg name resolution ([#11811](https://github.com/netdata/netdata/pull/11811), [@underhood](https://github.com/underhood)) - Add localhost hostname to the edit_command ([#11793](https://github.com/netdata/netdata/pull/11793), [@MrZammler](https://github.com/MrZammler)) ## 
Deprecation notice <a id="deprecation-notice"></a> The following items will be removed in our next release: - **backends** subsystem. Has been replaced by the [exporting engine](https://learn.netdata.cloud/docs/agent/exporting). - **node.d/fronius** collector. Will be moved to the [netdata/community](https://github.com/netdata/community) repository. - **node.d/sma_webbox** collector. Will be moved to the [netdata/community](https://github.com/netdata/community) repository. - **node.d/stiebeleltron** collector. Will be moved to the [netdata/community](https://github.com/netdata/community) repository. - **node.d/named** collector. Has been replaced by [go.d/bind](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/bind). Will be moved to the [netdata/community](https://github.com/netdata/community) repository. ### Deprecated in this release Following our previous deprecation notice legacy ACLK support is officially removed in this release. See more information in our [last release notes (v1.32)](https://github.com/netdata/netdata/releases/tag/v1.32.0#deprecation-notice). ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata agent, feel free to contact us by one of the following channels: - [Github](https://github.com/netdata): You can use our Github repo to report bugs. - [Community forum](https://community.netdata.cloud): You can visit our community forum for questions and training. - [Discord](https://discord.gg/2eduZdSeC7): You can jump into our Discord for interactive, synchronous help and discussion. More than 800 engineers are already using it! Join us! 2022-01-26T16:04:55+00:00 netdata v1.33.1 netdata v1.33.1 2022-02-14T19:34:58+00:00 Netdata v1.33.1 is a patch release to address issues discovered since v1.33.0. This release contains bug fixes and documentation updates. 
If you also use Netdata Cloud, please note that we started migrating nodes running on the old architecture to the new one. **Most users don’t have to take any action on their part**, but if you are affected by the migration, a banner will be added to your Cloud dashboard with a link to further instructions. If you love Netdata and haven't yet given us a [GitHub star](https://github.com/netdata/netdata), we would appreciate it if you did! ## Acknowledgments - [@petecooper](https://github.com/petecooper) for fixing a typo and improving the installer script usage message. - [@mohammed90](https://github.com/mohammed90) for updating syntax for Caddy v2 in docker install guide. ## Dashboard - Add legacy protocol deprecation notification in the header ([#12117](https://github.com/netdata/netdata/pull/12117)) - Fix handling of `after` and `before` URL params in direct links ([#12052](https://github.com/netdata/netdata/pull/12052)) ## Documentation - Add a note that the Synology install guide is maintained by the community ([#12086](https://github.com/netdata/netdata/pull/12086), [@ilyam8](https://github.com/ilyam8)) - Remove mention of libJudy in installation documentation for macOS ([#12080](https://github.com/netdata/netdata/pull/12080), [@vlvkobal](https://github.com/vlvkobal)) - Make requirement of mounting the docker socket for containers name resolution explicit ([#12079](https://github.com/netdata/netdata/pull/12079), [@ilyam8](https://github.com/ilyam8)) - Clean up installation docs ([#12057](https://github.com/netdata/netdata/pull/12057), [@kickoke](https://github.com/kickoke)) - Improve Alerta readme ([#11944](https://github.com/netdata/netdata/pull/11944), [@kickoke](https://github.com/kickoke)) - Fix unresolved file references ([#12053](https://github.com/netdata/netdata/pull/12053), [@ilyam8](https://github.com/ilyam8)) - Update the docs to match new installation script ([#12042](https://github.com/netdata/netdata/pull/12042), 
[@kickoke](https://github.com/kickoke)) - Update syntax for Caddy v2 in docker install guide ([#12092](https://github.com/netdata/netdata/pull/12092), [@mohammed90](https://github.com/mohammed90)) - Fix paths to install boxes ([#12109](https://github.com/netdata/netdata/pull/12109), [@kickoke](https://github.com/kickoke)) - Fix matching the new install box component name ([#12106](https://github.com/netdata/netdata/pull/12106), [@kickoke](https://github.com/kickoke)) - Remove ACLK legacy documentation ([#12103](https://github.com/netdata/netdata/pull/12103), [@underhood](https://github.com/underhood)) - Add interactive kickstart scripts where possible ([#12098](https://github.com/netdata/netdata/pull/12098), [@kickoke](https://github.com/kickoke)) ## Packaging / Installation - Add native installation for Rocky linux ([#12081](https://github.com/netdata/netdata/pull/12081), [@maneamarius](https://github.com/maneamarius)) - Fix compilation errors for OpenSSL on macOS ([#12048](https://github.com/netdata/netdata/pull/12048), [@vlvkobal](https://github.com/vlvkobal)) - Fix handling of non-x86 static builds in updater ([#12055](https://github.com/netdata/netdata/pull/12055), [@Ferroin](https://github.com/Ferroin)) - Fix handling of removed packages with leftover config files in installer ([#12033](https://github.com/netdata/netdata/pull/12033), [@Ferroin](https://github.com/Ferroin)) - Improve the installer script usage message ([#12062](https://github.com/netdata/netdata/pull/12062), [@petecooper](https://github.com/petecooper)) - Add proper support for Oracle Linux native packages to installer ([#12101](https://github.com/netdata/netdata/pull/12101), [@Ferroin](https://github.com/Ferroin)) - Fix handling of Oracle Linux repoconfig packages ([#12100](https://github.com/netdata/netdata/pull/12100), [@Ferroin](https://github.com/Ferroin)) - Fix handling non-interactive installs as non-root users ([#12089](https://github.com/netdata/netdata/pull/12089), 
[@Ferroin](https://github.com/Ferroin)) - Add info about installer interactivity to anonymous installer telemetry events ([#12088](https://github.com/netdata/netdata/pull/12088), [@Ferroin](https://github.com/Ferroin)) - Make the netdata-installer script POSIX compliant ([#11961](https://github.com/netdata/netdata/pull/11961), [@maneamarius](https://github.com/maneamarius)) - Make a lack of an os-release file non-fatal on install ([#12087](https://github.com/netdata/netdata/pull/12087), [@Ferroin](https://github.com/Ferroin)) ## Bug Fixes - Fix compilation errors caused by including "lz4.h" when stream compression is disabled ([#12049](https://github.com/netdata/netdata/pull/12049), [@odynik](https://github.com/odynik)) - Disable ebpf socket thread causing crashes on some systems ([#12085](https://github.com/netdata/netdata/pull/12085), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix ACLK reconnect endless loop ([#12074](https://github.com/netdata/netdata/pull/12074), [@underhood](https://github.com/underhood)) - Fix compilation errors when openssl is not available and compiling with --disable-https and --disable-cloud ([#12071](https://github.com/netdata/netdata/pull/12071), [@MrZammler](https://github.com/MrZammler)) ## Other Notable Changes - Adds legacy protocol deprecation banner to agent log ([#12065](https://github.com/netdata/netdata/pull/12065), [@underhood](https://github.com/underhood)) ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata agent, feel free to contact us by one of the following channels: - [GitHub](https://github.com/netdata): You can use our GitHub repo to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): We are using GitHub Discussions to document our development process so you can be a part of it.
- [Community forum](https://community.netdata.cloud): You can visit our community forum for questions and training. - [Discord](https://discord.gg/mPZ6WZKKG2): You can jump into our Discord for interactive, synchronous help and discussion. More than 800 engineers are already using it! Join us! 2022-02-14T19:34:58+00:00 netdata 1.34.0 netdata 1.34.0 2022-04-14T17:57:41+00:00 **Table of contents** - [Kubernetes Monitoring: new charts for CPU throttling](#v1340-k8s) - [Machine learning (ML) powered anomaly detection](#v1340-ML-anomaly-detection) - [Streaming compression now enabled by default](#v1340-streaming) - [SNMP collector now runs on Go](#v1340-snmp) - [Improved installation experience](#v1340-packaging) <!-- Remove if there are no deprecations --> > ❗ We're keeping our codebase healthy by removing features that are end of life. Read the [deprecation notice](#deprecation-notice) to check if you are affected. <!-- Remove if there are no useful stats --> ### Netdata open-source Agent statistics We're proud to empower each and every one of you to troubleshoot your infrastructure using Netdata: - 7.3M+ troubleshooters monitor with Netdata - 1.3M+ unique nodes currently live - 3.3k+ new nodes per day - 51k+ Docker pulls per day with 387M all-time total If you're part of our community and love Netdata, please give us a star on [GitHub⭐](https://github.com/netdata/netdata). ## Release highlights ### Kubernetes Monitoring: New charts for CPU throttling <a id="v1340-k8s"></a> Have you seen your applications get stuck or fail to respond to health checks? It might be the CPU quota limit! Kubernetes relies on the kernel control group (cgroup) mechanisms to manage CPU constraints. The CPU quota is allocated based on a period of time, not on available CPU power. When an application has used its allotted quota for a given period, it gets throttled until the next period. 
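The quota-per-period accounting described above is exposed by the kernel in each cgroup's `cpu.stat` file. As a rough sketch (not necessarily how Netdata's cgroups.plugin computes its charts), the share of throttled periods between two samples can be derived from the `nr_periods` and `nr_throttled` counters. The sample values and the helper names `parse_cpu_stat` and `throttled_percent` are hypothetical; cgroup v1 reports throttled time as `throttled_time` in nanoseconds, while cgroup v2 uses `throttled_usec`.

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_percent(prev: Dict[str, int], curr: Dict[str, int]) -> float:
    """Percentage of enforcement periods between two samples that were throttled."""
    periods = curr["nr_periods"] - prev["nr_periods"]
    throttled = curr["nr_throttled"] - prev["nr_throttled"]
    if periods <= 0:
        # No enforcement period elapsed between samples (or counters were reset).
        return 0.0
    return 100.0 * throttled / periods


# Hypothetical samples taken one collection interval apart (values are made up).
earlier = parse_cpu_stat("nr_periods 100\nnr_throttled 10\nthrottled_time 5000\n")
later = parse_cpu_stat("nr_periods 200\nnr_throttled 35\nthrottled_time 9000\n")
print(throttled_percent(earlier, later))
```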
So if you don’t set your CPU limits correctly, your applications will be throttled while your CPU may be idle. And CPU throttling is really hard to identify since Kubernetes only exposes usage metrics. In this release, we make troubleshooting Kubernetes even easier by adding two new charts for CPU throttling: - CPU throttled Runnable Periods: The percentage of runnable periods when tasks in a cgroup have been throttled. - CPU throttled Time Duration: The total time duration for which tasks in a cgroup have been throttled. ![image](https://user-images.githubusercontent.com/13576110/163075321-a0a8ce2f-fe75-487d-8891-37050039c6b3.png) ### Machine learning (ML) powered anomaly detection <a id="v1340-ML-anomaly-detection"></a> The performance of the machine learning threads has been significantly optimized in this release. We were able to reduce peak CPU usage considerably by sampling input data randomly and excluding constant metrics from training. That way, we've optimized performance while maintaining high levels of accuracy. If you're streaming data between nodes: We've optimized CPU usage on parent nodes with multiple child nodes by altering the training thread's max sleep time. ### Streaming compression is now in Alpha <a id="v1340-streaming"></a> We introduced streaming compression in [Netdata Agent v1.33.0](https://github.com/netdata/netdata/releases/v1.33.0#streaming-compression) as a tech preview. The feature has matured a lot since then, so we are moving it forward to the alpha stage. From now on, streaming compression will be enabled by default, allowing you to leverage faster streaming between parent and child nodes at a lower bandwidth. ### SNMP collector now runs on Go <a id="v1340-snmp"></a> Go is known for its reliability and blazing speed - precisely what you need when monitoring networks. We've rewritten our SNMP collector from Node.js to Go.
Apart from improved configuration options, the new collector eliminates the need for Node.js, slimming down our dependency tree. > Note: The node.js-based SNMP collector will be deprecated in the next release, see the [deprecation notice](#deprecation-notice). 📄 [SNMP Go collector documentation](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/snmp) ### Improved installation experience <a id="v1340-packaging"></a> We have been improving our [kickstart script](https://learn.netdata.cloud/docs/agent/packaging/installer/methods/kickstart) to give you a smooth installation experience. We've added some handy features like: - Dry run mode: Show what would be done without actually modifying the system, including reporting a number of common installation issues before they arise. - Overhauled auto-update management: Including support for auto-updates with our native packages and much easier control of whether auto updates are enabled or not. - Improved reinstallation support: With the new `--reinstall-clean` option, you can now have the kickstart script cleanly uninstall an existing installation before installing Netdata again. ## Acknowledgments We would like to thank our dedicated, talented contributors who make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - [@xrgman](https://github.com/xrgman) for fixing typos in our documentation. - [@wooyey](https://github.com/wooyey) for fixing a parsing error in python.d/hpssa collector. - [@tycho](https://github.com/tycho) for fixing Python collectors that use `sudo`. - [@tnagorran](https://github.com/tnagorran) for fixing a typo in the step-by-step Netdata guide. - [@rex4539](https://github.com/rex4539) for fixing typos. - [@petecooper](https://github.com/petecooper) for improving the installer script usage message.
- [@godismyjudge95](https://github.com/godismyjudge95) for fixing a bug in the updater script. - [@fayak](https://github.com/fayak) for fixing parsing of claiming extra parameters in kickstart. - [@dvdmuckle](https://github.com/dvdmuckle) for fixing a typo in ZFS ARC Cache size dashboard info. - [@d--j](https://github.com/d--j) for fixing setting of 'time offset' configuration option in timex plugin. - [@cimnine](https://github.com/cimnine) for fixing a bug when `tar` can not set the correct permissions during installation. - [@AlexGhiti](https://github.com/AlexGhiti) for fixing building Netdata on riscv64. - [@Daniel15](https://github.com/Daniel15) for fixing license URL. - [@MariosMarinos](https://github.com/MariosMarinos) for fixing a typo in the anomaly-detection-python.md file. - [@RatishT](https://github.com/RatishT) for fixing typo in Running-behind-haproxy.md. - [@DanTheMediocre](https://github.com/DanTheMediocre) for improving timex plugin documentation and dashboard info. - [@DanTheMediocre](https://github.com/DanTheMediocre) for fixing a typo in anomaly-detection-python.md. - [@Steve8291](https://github.com/Steve8291) for fixing ioping_disk_latency alarm lookup value. - [@Steve8291](https://github.com/Steve8291) for fixing config file check in stock config directory in ioping plugin. - [@Steve8291](https://github.com/Steve8291) for adding a link to Netdata badges readme in the health documentation. - [@Steve8291](https://github.com/Steve8291) for fixing libnetfilter-acct-dev package name in nfacct plugin documentation. <!-- Repeat this structure for all new/ improved features. For example: Dashboards, Collectors, Notifications, etc. 
--> ## Collectors ### New collectors - Add CPU throttling charts (cgroups.plugin) ([#12591](https://github.com/netdata/netdata/pull/12591), [@ilyam8](https://github.com/ilyam8)) - Add clock status chart (timex.plugin) ([#12501](https://github.com/netdata/netdata/pull/12501), [@ilyam8](https://github.com/ilyam8)) - Add Asterisk configuration file with synthetic charts (statsd.plugin) ([#12381](https://github.com/netdata/netdata/pull/12381), [@ilyam8](https://github.com/ilyam8)) - Add new chart for process states metrics (apps.plugin) ([#12305](https://github.com/netdata/netdata/pull/12305), [@surajnpn](https://github.com/surajnpn)) - Add thermal zone metrics collection (go.d/wmi) ([#667](https://github.com/netdata/go.d.plugin/pull/667), [@ilyam8](https://github.com/ilyam8)) - Add SNMP data collector (go.d/snmp) ([#644](https://github.com/netdata/go.d.plugin/pull/644), [@surajnpn](https://github.com/surajnpn)) ### Improvements ⚙️ Enhancing our collectors to collect all the data you need. <details><summary>See all pull requests</summary> - Add 'locust' to apps_groups.conf ([#12498](https://github.com/netdata/netdata/pull/12498), [@andrewm4894](https://github.com/andrewm4894)) - Enable timex plugin for non-linux systems (timex.plugin) ([#12489](https://github.com/netdata/netdata/pull/12489), [@surajnpn](https://github.com/surajnpn)) - Prefer 'blkio.*_recursive' files when available (cgroups.plugin) ([#12462](https://github.com/netdata/netdata/pull/12462), [@ilyam8](https://github.com/ilyam8)) - Add 'stress-ng' and 'gremlin' to apps_groups.conf (apps.plugin) ([#12165](https://github.com/netdata/netdata/pull/12165), [@andrewm4894](https://github.com/andrewm4894)) - Add Apple Filing Protocol daemons into 'afp' group (apps.plugin) ([#12078](https://github.com/netdata/netdata/pull/12078), [@ilyam8](https://github.com/ilyam8)) - Show the number of processes/threads for empty apps groups (apps.plugin) ([#11834](https://github.com/netdata/netdata/pull/11834), 
[@vlvkobal](https://github.com/vlvkobal)) - Add a configuration option to set application (go.d/prometheus) ([#669](https://github.com/netdata/go.d.plugin/pull/669), [@ilyam8](https://github.com/ilyam8)) </details> ### Bug fixes 🐞 Improving our collectors one bug fix at a time. <details><summary>See all pull requests</summary> - Fix collecting data when 'ntp_adjtime' call fails (timex.plugin) ([#12667](https://github.com/netdata/netdata/pull/12667), [@vlvkobal](https://github.com/vlvkobal)) - Fix chart titles with instance-specific information ([#12644](https://github.com/netdata/netdata/pull/12644), [@ilyam8](https://github.com/ilyam8)) - Fix CPU utilization calculation (cgroups.plugin) ([#12622](https://github.com/netdata/netdata/pull/12622), [@ilyam8](https://github.com/ilyam8)) - Fix checking for IOMainPort on MacOS (macos.plugin) ([#12600](https://github.com/netdata/netdata/pull/12600), [@vlvkobal](https://github.com/vlvkobal)) - Fix cgroup version detection with systemd (cgroups.plugin) ([#12553](https://github.com/netdata/netdata/pull/12553), [@vlvkobal](https://github.com/vlvkobal)) - Fix network charts context (cgroups.plugin) ([#12454](https://github.com/netdata/netdata/pull/12454), [@ilyam8](https://github.com/ilyam8)) - Fix sending unnecessary data in FreeBSD (apps.plugin) ([#12446](https://github.com/netdata/netdata/pull/12446), [@surajnpn](https://github.com/surajnpn)) - Fix charts context (cups.plugin) ([#12444](https://github.com/netdata/netdata/pull/12444), [@ilyam8](https://github.com/ilyam8)) - Fix recursion in apcupsd_check (charts.d/apcupsd) ([#12418](https://github.com/netdata/netdata/pull/12418), [@ilyam8](https://github.com/ilyam8)) - Fix double host prefix when Netdata running in a podman container (cgroups.plugin) ([#12380](https://github.com/netdata/netdata/pull/12380), [@ilyam8](https://github.com/ilyam8)) - Fix config file check in stock config directory (ioping.plugin) ([#12327](https://github.com/netdata/netdata/pull/12327), 
[@Steve8291](https://github.com/Steve8291)) - Fix setting of 'time offset' configuration option (timex.plugin) ([#12281](https://github.com/netdata/netdata/pull/12281), [@d--j](https://github.com/d--j)) - Fix logical drive data parsing error (python.d/hpssa) ([#12206](https://github.com/netdata/netdata/pull/12206), [@wooyey](https://github.com/wooyey)) - Fix getting username when UID is unknown on the host (python.d/nvidia_smi) ([#12184](https://github.com/netdata/netdata/pull/12184), [@ilyam8](https://github.com/ilyam8)) - Fix a typo in ZFS ARC Cache size info ([#12138](https://github.com/netdata/netdata/pull/12138), [@dvdmuckle](https://github.com/dvdmuckle)) - Fix collecting of renamed metrics (go.d/k8s_kubelet) ([#674](https://github.com/netdata/go.d.plugin/pull/674), [@ilyam8](https://github.com/ilyam8)) - Fix reading stock configuration files in k8s (go.d.plugin) ([#670](https://github.com/netdata/go.d.plugin/pull/670), [@ilyam8](https://github.com/ilyam8)) - Fix runtime chart context hard coding (go.d.plugin) ([#668](https://github.com/netdata/go.d.plugin/pull/668), [@ilyam8](https://github.com/ilyam8)) - Fix failed check because of invalid metric type (go.d/prometheus) ([#665](https://github.com/netdata/go.d.plugin/pull/665), [@ilyam8](https://github.com/ilyam8)) - Fix handling of replica set charts dimensions (go.d/mongodb) ([#646](https://github.com/netdata/go.d.plugin/pull/646), [@ilyam8](https://github.com/ilyam8)) </details> ## eBPF ### Improvements - Improve chart titles and dashboard info ([#12665](https://github.com/netdata/netdata/pull/12665), [@thiagoftsm](https://github.com/thiagoftsm)) - Update eBPF dashboard info ([#12617](https://github.com/netdata/netdata/pull/12617), [@thiagoftsm](https://github.com/thiagoftsm)) - Update links in the dashboard info ([#12581](https://github.com/netdata/netdata/pull/12581), [@thiagoftsm](https://github.com/thiagoftsm)) - Add monitoring for inbound and outbound connections 
([#12532](https://github.com/netdata/netdata/pull/12532), [@thiagoftsm](https://github.com/thiagoftsm)) - Improve eBPF dashboard info ([#12467](https://github.com/netdata/netdata/pull/12467), [@thiagoftsm](https://github.com/thiagoftsm)) - Add CO-RE support for eBPF plugin ([#12318](https://github.com/netdata/netdata/pull/12318), [@thiagoftsm](https://github.com/thiagoftsm)) - Update libbpf version and adjust eBPF modules for using new version of libbpf ([#12190](https://github.com/netdata/netdata/pull/12190), [@thiagoftsm](https://github.com/thiagoftsm)) ### Bug fixes 🐞 Improving eBPF integration one bug fix at a time. <details><summary>See all pull requests</summary> - Fix missing chart context for cgroups charts ([#12671](https://github.com/netdata/netdata/pull/12671), [@ilyam8](https://github.com/ilyam8)) - Fix eBPF plugin crash on exit ([#12590](https://github.com/netdata/netdata/pull/12590), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix unnecessary error log lines for proc and sys files ([#12385](https://github.com/netdata/netdata/pull/12385), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix removing pid file on exit ([#12379](https://github.com/netdata/netdata/pull/12379), [@thiagoftsm](https://github.com/thiagoftsm)) </details> ## Dashboard - Change color of Netdata logo on left sidebar ([#12607](https://github.com/netdata/netdata/pull/12607)) - Update Community section and the links for opening a new issue on GitHub in 'Need Help?' 
modal ([#12607](https://github.com/netdata/netdata/pull/12607)) - Add 'Netdata Cloud connection status' modal ([#12407](https://github.com/netdata/netdata/pull/12407)) ## Streaming - Fix parsing of 'os_name' for older agent versions streaming to a parent ([#12425](https://github.com/netdata/netdata/pull/12425), [@stelfrag](https://github.com/stelfrag)) - Deactivate streaming compression at runtime in case of a compressor buffer overflow ([#12037](https://github.com/netdata/netdata/pull/12037), [@odynik](https://github.com/odynik)) ## Exporting - Remove backends subsystem ([#12146](https://github.com/netdata/netdata/pull/12146), [@vlvkobal](https://github.com/vlvkobal)) ## Health - Fix ioping_disk_latency alarm green/red thresholds ([#12351](https://github.com/netdata/netdata/pull/12351), [@ilyam8](https://github.com/ilyam8)) - Fix ioping_disk_latency alarm lookup value ([#12329](https://github.com/netdata/netdata/pull/12329), [@Steve8291](https://github.com/Steve8291)) - Adjust 10s_ipv4_tcp_resets_sent alarm warn expression ([#12320](https://github.com/netdata/netdata/pull/12320), [@ilyam8](https://github.com/ilyam8)) - Add alarms for charts.d/nut collector ([#12285](https://github.com/netdata/netdata/pull/12285), [@ilyam8](https://github.com/ilyam8)) - Fix respecting of 'delay' parameter when using 'repeat' feature ([#12164](https://github.com/netdata/netdata/pull/12164), [@erdem2000](https://github.com/erdem2000)) ## ML - Fix training/prediction stats charts context ([#12610](https://github.com/netdata/netdata/pull/12610), [@vkalintiris](https://github.com/vkalintiris)) - Enable streaming of anomaly_detection.* charts ([#12606](https://github.com/netdata/netdata/pull/12606), [@vkalintiris](https://github.com/vkalintiris)) - Update ML-related charts ([#12574](https://github.com/netdata/netdata/pull/12574), [@vkalintiris](https://github.com/vkalintiris)) - Reduce min 'dbengine anomaly rate every' from 60s to 30s 
([#12543](https://github.com/netdata/netdata/pull/12543), [@andrewm4894](https://github.com/andrewm4894)) - ML-related changes to address issue/discussion comments. ([#12494](https://github.com/netdata/netdata/pull/12494), [@vkalintiris](https://github.com/vkalintiris)) - Skip 'foreach' alarms for dimensions of anomaly rate chart. ([#12441](https://github.com/netdata/netdata/pull/12441), [@vkalintiris](https://github.com/vkalintiris)) - Prepend context in anomaly rate dimension id ([#12342](https://github.com/netdata/netdata/pull/12342), [@vkalintiris](https://github.com/vkalintiris)) - Skip training of constant metrics ([#12212](https://github.com/netdata/netdata/pull/12212), [@vkalintiris](https://github.com/vkalintiris)) - Track anomaly rates with DBEngine ([#12083](https://github.com/netdata/netdata/pull/12083), [@vkalintiris](https://github.com/vkalintiris)) ## Packaging / Installation 📦 "Handle with care" - Just like handling physical packages, we put in a lot of care and effort to publish beautiful software packages. 
<details><summary>See all pull requests</summary> - Summarize encountered errors and warnings at end of kickstart script run ([#12636](https://github.com/netdata/netdata/pull/12636), [@Ferroin](https://github.com/Ferroin)) - Fix logging an incorrect configuration option in kickstart ([#12657](https://github.com/netdata/netdata/pull/12657), [@MrZammler](https://github.com/MrZammler)) - Add eBPF CO-RE version and checksum files to distfile list ([#12627](https://github.com/netdata/netdata/pull/12627), [@Ferroin](https://github.com/Ferroin)) - Fix "print: command not found" issue in kickstart ([#12615](https://github.com/netdata/netdata/pull/12615), [@maneamarius](https://github.com/maneamarius)) - Check if libatomic can be linked ([#12583](https://github.com/netdata/netdata/pull/12583), [@MrZammler](https://github.com/MrZammler)) - Fix missing setuid bit for ioping.plugin after reinstalling Debian package ([#12580](https://github.com/netdata/netdata/pull/12580), [@ilyam8](https://github.com/ilyam8)) - Improve kickstart messaging ([#12577](https://github.com/netdata/netdata/pull/12577), [@Ferroin](https://github.com/Ferroin)) - Fix temporary directory handling for dependency handling script in updater ([#12562](https://github.com/netdata/netdata/pull/12562), [@Ferroin](https://github.com/Ferroin)) - Improve netdata-updater logging messages ([#12557](https://github.com/netdata/netdata/pull/12557), [@ilyam8](https://github.com/ilyam8)) - Fix building on MacOS ([#12554](https://github.com/netdata/netdata/pull/12554), [@underhood](https://github.com/underhood)) - Fix FreeBSD bundled protobuf build if system one is present ([#12552](https://github.com/netdata/netdata/pull/12552), [@underhood](https://github.com/underhood)) - Add '--reinstall-clean' flag to kickstart ([#12548](https://github.com/netdata/netdata/pull/12548), [@maneamarius](https://github.com/maneamarius)) - Fix enabling netdata.service during installation on Debian/Ubuntu 
([#12542](https://github.com/netdata/netdata/pull/12542), [@ralphm](https://github.com/ralphm)) - Upgrade protocol buffer version to 3.19.4 ([#12537](https://github.com/netdata/netdata/pull/12537), [@surajnpn](https://github.com/surajnpn)) - Remove using non-default values for CPU scheduling policy/OOM score in native packages ([#12529](https://github.com/netdata/netdata/pull/12529), [@ilyam8](https://github.com/ilyam8)) - Fix enabling auto-updates in kickstart when the script is run as a normal user ([#12526](https://github.com/netdata/netdata/pull/12526), [@ilyam8](https://github.com/ilyam8)) - Fix netdata-updater script for Debian packages ([#12524](https://github.com/netdata/netdata/pull/12524), [@ilyam8](https://github.com/ilyam8)) - Fix importing Gpg Keys issue on Centos when installing Netdata in interactive mode ([#12519](https://github.com/netdata/netdata/pull/12519), [@maneamarius](https://github.com/maneamarius)) - Fix importing Gpg Keys issue on Centos7 when installing Netdata in interactive mode ([#12506](https://github.com/netdata/netdata/pull/12506), [@maneamarius](https://github.com/maneamarius)) - Skip running the updater in kickstart dry-run mode ([#12497](https://github.com/netdata/netdata/pull/12497), [@Ferroin](https://github.com/Ferroin)) - Add '--force-update' parameter to netdata-updater ([#12493](https://github.com/netdata/netdata/pull/12493), [@ilyam8](https://github.com/ilyam8)) - Fix enabling auto-updates in the netdata-updater.sh script ([#12491](https://github.com/netdata/netdata/pull/12491), [@ilyam8](https://github.com/ilyam8)) - Bump the debhelper compat level to 10 in our DEB packaging code. 
([#12488](https://github.com/netdata/netdata/pull/12488), [@Ferroin](https://github.com/Ferroin)) - Recognize Almalinux as an RHEL clone ([#12487](https://github.com/netdata/netdata/pull/12487), [@Ferroin](https://github.com/Ferroin)) - Update static build components to latest versions ([#12461](https://github.com/netdata/netdata/pull/12461), [@ktsaou](https://github.com/ktsaou)) - Add support for passing extra claiming options when claiming with Docker ([#12457](https://github.com/netdata/netdata/pull/12457), [@Ferroin](https://github.com/Ferroin)) - Fix detection of install type when static or build installation was performed on a native-supported platform ([#12438](https://github.com/netdata/netdata/pull/12438), [@maneamarius](https://github.com/maneamarius)) - Fix checksum validation error when installing on BSD systems ([#12429](https://github.com/netdata/netdata/pull/12429), [@ilyam8](https://github.com/ilyam8)) - Lowercase uuidgen value in the netdata-claim script ([#12422](https://github.com/netdata/netdata/pull/12422), [@ilyam8](https://github.com/ilyam8)) - Add a delay between starting Netdata and checking pids ([#12420](https://github.com/netdata/netdata/pull/12420), [@ilyam8](https://github.com/ilyam8)) - Allow updates without environment files in some cases ([#12400](https://github.com/netdata/netdata/pull/12400), [@Ferroin](https://github.com/Ferroin)) - Reorder functions properly in updater script ([#12399](https://github.com/netdata/netdata/pull/12399), [@Ferroin](https://github.com/Ferroin)) - Fix shellcheck warnings in Docker run.sh ([#12377](https://github.com/netdata/netdata/pull/12377), [@ilyam8](https://github.com/ilyam8)) - Fix handling of checks for newer updater script on update ([#12367](https://github.com/netdata/netdata/pull/12367), [@Ferroin](https://github.com/Ferroin)) - Unconditionally link against libatomic ([#12366](https://github.com/netdata/netdata/pull/12366), [@AlexGhiti](https://github.com/AlexGhiti)) - Redirect dependency 
handling script output to logfile when running from the updater ([#12341](https://github.com/netdata/netdata/pull/12341), [@Ferroin](https://github.com/Ferroin)) - Use default "bind to" in native packages ([#12336](https://github.com/netdata/netdata/pull/12336), [@ilyam8](https://github.com/ilyam8)) - Use the built agent version for Netdata static build archive name ([#12335](https://github.com/netdata/netdata/pull/12335), [@Ferroin](https://github.com/Ferroin)) - Set repo priority in YUM/DNF repository configuration ([#12332](https://github.com/netdata/netdata/pull/12332), [@Ferroin](https://github.com/Ferroin)) - Add a dry run mode to the kickstart script ([#12322](https://github.com/netdata/netdata/pull/12322), [@Ferroin](https://github.com/Ferroin)) - Provide better handling of config files in Docker containers ([#12310](https://github.com/netdata/netdata/pull/12310), [@Ferroin](https://github.com/Ferroin)) - Fix uninstall using kickstart flag ([#12304](https://github.com/netdata/netdata/pull/12304), [@maneamarius](https://github.com/maneamarius)) - Fixing writing to stderr on success when testing tmpdir in updater ([#12298](https://github.com/netdata/netdata/pull/12298), [@godismyjudge95](https://github.com/godismyjudge95)) - Switch to using netdata-updater.sh to toggle auto updates on and off when installing ([#12296](https://github.com/netdata/netdata/pull/12296), [@Ferroin](https://github.com/Ferroin)) - Pull in build dependencies when updating a locally built install ([#12294](https://github.com/netdata/netdata/pull/12294), [@Ferroin](https://github.com/Ferroin)) - Fix setting of claiming extra parameters in kickstart ([#12289](https://github.com/netdata/netdata/pull/12289), [@ilyam8](https://github.com/ilyam8)) - Fix incorrect install-type on some older nightly installs ([#12282](https://github.com/netdata/netdata/pull/12282), [@Ferroin](https://github.com/Ferroin)) - Add proper handling for legacy kickstart install detection 
([#12273](https://github.com/netdata/netdata/pull/12273), [@Ferroin](https://github.com/Ferroin)) - Revise claiming error message in kickstart script ([#12248](https://github.com/netdata/netdata/pull/12248), [@Ferroin](https://github.com/Ferroin)) - Fix libc detection when installing eBPF plugin ([#12242](https://github.com/netdata/netdata/pull/12242), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix license URL ([#12219](https://github.com/netdata/netdata/pull/12219), [@Daniel15](https://github.com/Daniel15)) - Add support to the updater to toggle auto-updates on and off ([#12202](https://github.com/netdata/netdata/pull/12202), [@Ferroin](https://github.com/Ferroin)) - Fix detection of existing installs in kickstart ([#12199](https://github.com/netdata/netdata/pull/12199), [@Ferroin](https://github.com/Ferroin)) - Make netdata-uninstaller.sh POSIX compatibility and add --uninstall flag ([#12195](https://github.com/netdata/netdata/pull/12195), [@maneamarius](https://github.com/maneamarius)) - Add warning about broken Docker hosts in container entrypoint ([#12175](https://github.com/netdata/netdata/pull/12175), [@Ferroin](https://github.com/Ferroin)) - Tidy up the installer script usage message ([#12171](https://github.com/netdata/netdata/pull/12171), [@petecooper](https://github.com/petecooper)) - Bundle protobuf on CentOS 7 and earlier ([#12167](https://github.com/netdata/netdata/pull/12167), [@Ferroin](https://github.com/Ferroin)) - Fix parsing of claiming extra parameters in kickstart ([#12148](https://github.com/netdata/netdata/pull/12148), [@fayak](https://github.com/fayak)) - Improve messaging around unknown install handling in kickstart script ([#12134](https://github.com/netdata/netdata/pull/12134), [@Ferroin](https://github.com/Ferroin)) - Rename DO_NOT_TRACK to DISABLE_TELEMETRY ([#12126](https://github.com/netdata/netdata/pull/12126), [@ilyam8](https://github.com/ilyam8)) - Overhaul handling of auto-updates in the installer code 
([#12076](https://github.com/netdata/netdata/pull/12076), [@Ferroin](https://github.com/Ferroin)) - Add handling for claiming non-standard install types with kickstart ([#12064](https://github.com/netdata/netdata/pull/12064), [@Ferroin](https://github.com/Ferroin)) - Add '--no-same-owner' to 'tar xf' in installer ([#11940](https://github.com/netdata/netdata/pull/11940), [@cimnine](https://github.com/cimnine)) - Update netdata-service CapabilityBoundingSet to fix collectors using sudo ([#10201](https://github.com/netdata/netdata/pull/10201), [@tycho](https://github.com/tycho)) </details> ## Documentation 📄 Keeping our documentation healthy together with our awesome community. <details><summary>See all pull requests</summary> - Update the streaming docs to reflect the default settings for stream compression ([#12669](https://github.com/netdata/netdata/pull/12669), [@odynik](https://github.com/odynik)) - Add missing configuration options to ML readme file ([#12575](https://github.com/netdata/netdata/pull/12575), [@andrewm4894](https://github.com/andrewm4894)) - Update anonymous-statistics readme for PostHog Cloud ([#12571](https://github.com/netdata/netdata/pull/12571), [@andrewm4894](https://github.com/andrewm4894)) - Fix unresolved file references ([#12528](https://github.com/netdata/netdata/pull/12528), [@ilyam8](https://github.com/ilyam8)) - Improve eBPF documentation ([#12503](https://github.com/netdata/netdata/pull/12503), [@thiagoftsm](https://github.com/thiagoftsm)) - Improve timex plugin documentation and dashboard info description ([#12495](https://github.com/netdata/netdata/pull/12495), [@DanTheMediocre](https://github.com/DanTheMediocre)) - Remove mention of py2/py3 compatibility from python plugin readme ([#12465](https://github.com/netdata/netdata/pull/12465), [@ilyam8](https://github.com/ilyam8)) - Fix broken link in streaming docs ([#12428](https://github.com/netdata/netdata/pull/12428), [@kickoke](https://github.com/kickoke)) - Add content for eBPF 
docs ([#12417](https://github.com/netdata/netdata/pull/12417), [@thiagoftsm](https://github.com/thiagoftsm)) - Add link to Netdata badges readme in health docs ([#12412](https://github.com/netdata/netdata/pull/12412), [@Steve8291](https://github.com/Steve8291)) - Add missing frontmatter in ML docs ([#12353](https://github.com/netdata/netdata/pull/12353), [@kickoke](https://github.com/kickoke)) - Fix libnetfilter-acct-dev package name in nfacct plugin docs ([#12326](https://github.com/netdata/netdata/pull/12326), [@Steve8291](https://github.com/Steve8291)) - Fix a typo in anomaly-detection-python.md ([#12317](https://github.com/netdata/netdata/pull/12317), [@DanTheMediocre](https://github.com/DanTheMediocre)) - Add ML notebooks ([#12313](https://github.com/netdata/netdata/pull/12313), [@andrewm4894](https://github.com/andrewm4894)) - Fix typos in Running-behind-haproxy.md ([#12272](https://github.com/netdata/netdata/pull/12272), [@RatishT](https://github.com/RatishT)) - Fix a typo in the step-by-step Netdata guide ([#12263](https://github.com/netdata/netdata/pull/12263), [@tnagorran](https://github.com/tnagorran)) - Fix a typo in anomaly-detection-python.md file ([#12220](https://github.com/netdata/netdata/pull/12220), [@MariosMarinos](https://github.com/MariosMarinos)) - Fix typos in documentation ([#12208](https://github.com/netdata/netdata/pull/12208), [@xrgman](https://github.com/xrgman)) - Add a note about known issues on older hosts with seccomp enabled and claiming ([#12192](https://github.com/netdata/netdata/pull/12192), [@ilyam8](https://github.com/ilyam8)) - Fix unresolved file references ([#12191](https://github.com/netdata/netdata/pull/12191), [@ilyam8](https://github.com/ilyam8)) - Fix various typos ([#12183](https://github.com/netdata/netdata/pull/12183), [@rex4539](https://github.com/rex4539)) - Fix claiming command in the kickstart readme ([#12161](https://github.com/netdata/netdata/pull/12161), [@kickoke](https://github.com/kickoke)) - Remove Google 
Analytics from the docs ([#12145](https://github.com/netdata/netdata/pull/12145), [@kickoke](https://github.com/kickoke)) - Improve kickstart cloud installation docs ([#12143](https://github.com/netdata/netdata/pull/12143), [@kickoke](https://github.com/kickoke)) - Fixed broken links ([#12142](https://github.com/netdata/netdata/pull/12142), [@kickoke](https://github.com/kickoke)) - Improve Amazon SNS notification method readme ([#11946](https://github.com/netdata/netdata/pull/11946), [@kickoke](https://github.com/kickoke)) </details> ## Other notable changes ### Improvements ⚙️ Greasing the gears to smoothen your experience with Netdata. <details><summary>See all pull requests</summary> - Add a chart label filter parameter in context data queries ([#12652](https://github.com/netdata/netdata/pull/12652), [@stelfrag](https://github.com/stelfrag)) - Add a timeout parameter to data queries ([#12649](https://github.com/netdata/netdata/pull/12649), [@stelfrag](https://github.com/stelfrag)) - Add k8s cluster name to host labels (GKE only) ([#12638](https://github.com/netdata/netdata/pull/12638), [@ilyam8](https://github.com/ilyam8)) - Add cloud providers info to host labels and /api/v1/info ([#12613](https://github.com/netdata/netdata/pull/12613), [@ilyam8](https://github.com/ilyam8)) - Reduce logging on child reconnect ([#12594](https://github.com/netdata/netdata/pull/12594), [@ilyam8](https://github.com/ilyam8)) - Improve ACLK sync logging ([#12534](https://github.com/netdata/netdata/pull/12534), [@stelfrag](https://github.com/stelfrag)) - Add more info to netdatacli 'aclk-state' ([#12458](https://github.com/netdata/netdata/pull/12458), [@underhood](https://github.com/underhood)) - Remove "web files" options leftovers ([#12403](https://github.com/netdata/netdata/pull/12403), [@ilyam8](https://github.com/ilyam8)) - Improve agent to cloud synchronization performance ([#12348](https://github.com/netdata/netdata/pull/12348), [@stelfrag](https://github.com/stelfrag)) - 
Remove owner check from webserver ([#12339](https://github.com/netdata/netdata/pull/12339), [@thiagoftsm](https://github.com/thiagoftsm)) - Change default OOM score and scheduling policy to behave more sanely ([#12271](https://github.com/netdata/netdata/pull/12271), [@Ferroin](https://github.com/Ferroin)) - Add more info to aclk-state API call ([#12231](https://github.com/netdata/netdata/pull/12231), [@underhood](https://github.com/underhood)) - Add -W keepopenfds option ([#12211](https://github.com/netdata/netdata/pull/12211), [@vkalintiris](https://github.com/vkalintiris)) - Remove chart specific configuration from netdata.conf except enabled ([#12209](https://github.com/netdata/netdata/pull/12209), [@stelfrag](https://github.com/stelfrag)) - Improve cleaning up of orphan hosts ([#12201](https://github.com/netdata/netdata/pull/12201), [@stelfrag](https://github.com/stelfrag)) - Add install method to /api/v1/info as label ([#12040](https://github.com/netdata/netdata/pull/12040), [@underhood](https://github.com/underhood)) - Add all query types to aclk_processed_query_type ([#12036](https://github.com/netdata/netdata/pull/12036), [@underhood](https://github.com/underhood)) - Create a removed alert event if chart goes obsolete ([#12021](https://github.com/netdata/netdata/pull/12021), [@MrZammler](https://github.com/MrZammler)) - Add chart for incoming proto msgs in new cloud protocol ([#11969](https://github.com/netdata/netdata/pull/11969), [@underhood](https://github.com/underhood)) </details> ### Bug fixes 🐞 Increasing Netdata's reliability one bug fix at a time. 
<details><summary>See all pull requests</summary> - Fix deadlock when deleting a child instance host and ML training is running ([#12681](https://github.com/netdata/netdata/pull/12681), [@vkalintiris](https://github.com/vkalintiris)) - Fix Netdata crash during anomaly calculation ([#12672](https://github.com/netdata/netdata/pull/12672), [@vkalintiris](https://github.com/vkalintiris)) - Fix not clean ACLK shutdown when agent is shutting down ([#12625](https://github.com/netdata/netdata/pull/12625), [@underhood](https://github.com/underhood)) - Fix shutting down the agent when the creation of the management API key file failed ([#12623](https://github.com/netdata/netdata/pull/12623), [@MrZammler](https://github.com/MrZammler)) - Fix respecting of dimension hidden option when executing a query ([#12570](https://github.com/netdata/netdata/pull/12570), [@stelfrag](https://github.com/stelfrag)) - Fix Agent crash on api/v1/info call ([#12565](https://github.com/netdata/netdata/pull/12565), [@erdem2000](https://github.com/erdem2000)) - Fix CPU frequency detection in system-info.sh ([#12550](https://github.com/netdata/netdata/pull/12550), [@ilyam8](https://github.com/ilyam8)) - Fix sending alert events with missing timezone data ([#12547](https://github.com/netdata/netdata/pull/12547), [@MrZammler](https://github.com/MrZammler)) - Fix invalid pointer reference when executing agent CLI commands ([#12540](https://github.com/netdata/netdata/pull/12540), [@stelfrag](https://github.com/stelfrag)) - Fix memory leaks on Netdata exit ([#12511](https://github.com/netdata/netdata/pull/12511), [@vlvkobal](https://github.com/vlvkobal)) - Fix wrong 'metrics-count' in /api/v1/info ([#12504](https://github.com/netdata/netdata/pull/12504), [@vkalintiris](https://github.com/vkalintiris)) - Fix issue with charts not properly synchronized with the cloud ([#12451](https://github.com/netdata/netdata/pull/12451), [@stelfrag](https://github.com/stelfrag)) - Fix high CPU usage for unclaimed agents 
([#12449](https://github.com/netdata/netdata/pull/12449), [@underhood](https://github.com/underhood)) - Fix CPU frequency detection of FreeBSD ([#12440](https://github.com/netdata/netdata/pull/12440), [@ilyam8](https://github.com/ilyam8)) - Fix a case when claim_id is sent in uppercase ([#12423](https://github.com/netdata/netdata/pull/12423), [@underhood](https://github.com/underhood)) - Fix crash when netdatacli command output too long ([#12393](https://github.com/netdata/netdata/pull/12393), [@underhood](https://github.com/underhood)) - Fix Netdata crash on ACLK alerts streaming ([#12392](https://github.com/netdata/netdata/pull/12392), [@MrZammler](https://github.com/MrZammler)) - Fix build info output when dbengine is not compiled ([#12354](https://github.com/netdata/netdata/pull/12354), [@underhood](https://github.com/underhood)) - Fix container virtualization detection with systemd-detect-virt ([#12338](https://github.com/netdata/netdata/pull/12338), [@ilyam8](https://github.com/ilyam8)) - Fix returning 0 for unknown CPU frequency in system-info.sh ([#12323](https://github.com/netdata/netdata/pull/12323), [@ilyam8](https://github.com/ilyam8)) - Fix CPU frequency detection for containers ([#12306](https://github.com/netdata/netdata/pull/12306), [@ilyam8](https://github.com/ilyam8)) - Fix CPU info detection on macOS ([#12293](https://github.com/netdata/netdata/pull/12293), [@ilyam8](https://github.com/ilyam8)) - Fix long timeouts on the cloud because the agent does not respond for failed queries with a failed message ([#12277](https://github.com/netdata/netdata/pull/12277), [@underhood](https://github.com/underhood)) - Fix registration of child nodes in the cloud through the parent ([#12241](https://github.com/netdata/netdata/pull/12241), [@stelfrag](https://github.com/stelfrag)) - Fix node information send to the cloud for older agent versions ([#12223](https://github.com/netdata/netdata/pull/12223), [@stelfrag](https://github.com/stelfrag)) - Fix Netdata crash 
on ACLK alerts streaming when 'info' field is missing ([#12210](https://github.com/netdata/netdata/pull/12210), [@MrZammler](https://github.com/MrZammler)) - Fix claiming with wget ([#12163](https://github.com/netdata/netdata/pull/12163), [@ilyam8](https://github.com/ilyam8)) - Fix CPU frequency calculation in system-info.sh ([#12162](https://github.com/netdata/netdata/pull/12162), [@ilyam8](https://github.com/ilyam8)) - Fix data query option allow_past to correctly work in memory mode ram and save ([#12136](https://github.com/netdata/netdata/pull/12136), [@stelfrag](https://github.com/stelfrag)) - Fix the format=array output in context queries ([#12129](https://github.com/netdata/netdata/pull/12129), [@stelfrag](https://github.com/stelfrag)) - Fix Netdata crash when there are charts with ids which differ only by symbols that are not '_' or alphanumeric and no unique names are provided ([#12067](https://github.com/netdata/netdata/pull/12067), [@vlvkobal](https://github.com/vlvkobal)) </details> ### Code organization 🏋️ Changes to keep our code base in good shape. 
<details><summary>See all pull requests</summary> - Fix a compilation warning ([#12608](https://github.com/netdata/netdata/pull/12608), [@vlvkobal](https://github.com/vlvkobal)) - Make sure registered static threads are unique ([#12538](https://github.com/netdata/netdata/pull/12538), [@vkalintiris](https://github.com/vkalintiris)) - Fix configure output of eBPF plugin ([#12471](https://github.com/netdata/netdata/pull/12471), [@underhood](https://github.com/underhood)) - Don't send an alert snapshot with snapshot_id 0 ([#12469](https://github.com/netdata/netdata/pull/12469), [@MrZammler](https://github.com/MrZammler)) - Implement fine-grained errors to cloud queries ([#12460](https://github.com/netdata/netdata/pull/12460), [@underhood](https://github.com/underhood)) - Initialize foreach alarms of dimensions in health thread ([#12452](https://github.com/netdata/netdata/pull/12452), [@vkalintiris](https://github.com/vkalintiris)) - Add delay on missing private key ([#12450](https://github.com/netdata/netdata/pull/12450), [@underhood](https://github.com/underhood)) - Update build/m4/ax_pthread.m4 ([#12390](https://github.com/netdata/netdata/pull/12390), [@vkalintiris](https://github.com/vkalintiris)) - Delay removed event for 60 seconds after the chart's last collected time ([#12388](https://github.com/netdata/netdata/pull/12388), [@MrZammler](https://github.com/MrZammler)) - Update Agent version in the Swagger API ([#12374](https://github.com/netdata/netdata/pull/12374), [@tkatsoulas](https://github.com/tkatsoulas)) - Replace write with read locks ([#12309](https://github.com/netdata/netdata/pull/12309), [@MrZammler](https://github.com/MrZammler)) - Add node_id into mirrored_hosts list ([#12307](https://github.com/netdata/netdata/pull/12307), [@underhood](https://github.com/underhood)) - Remove unused variable in the system-info script ([#12297](https://github.com/netdata/netdata/pull/12297), [@ilyam8](https://github.com/ilyam8)) - Only store alert hashes when 
iterated from localhost ([#12292](https://github.com/netdata/netdata/pull/12292), [@MrZammler](https://github.com/MrZammler)) - Adjust cloud dimension update frequency ([#12284](https://github.com/netdata/netdata/pull/12284), [@stelfrag](https://github.com/stelfrag)) - Add host labels _aclk_ng_new_cloud_protocol ([#12278](https://github.com/netdata/netdata/pull/12278), [@underhood](https://github.com/underhood)) - Null terminate decoded_query_string if there are no url parameters. ([#12266](https://github.com/netdata/netdata/pull/12266), [@MrZammler](https://github.com/MrZammler)) - Set a version number for the metadata database to better handle future data migrations ([#12249](https://github.com/netdata/netdata/pull/12249), [@stelfrag](https://github.com/stelfrag)) - Fix builds where HAVE_C___ATOMIC is not defined. ([#12240](https://github.com/netdata/netdata/pull/12240), [@vkalintiris](https://github.com/vkalintiris)) - Remove unused code ([#12230](https://github.com/netdata/netdata/pull/12230), [@underhood](https://github.com/underhood)) - Store dimension hidden option in the metadata db ([#12196](https://github.com/netdata/netdata/pull/12196), [@stelfrag](https://github.com/stelfrag)) - Remove check for ACLK_NG and PROMETHEUS_WRITE in order to assume PROTOBUF ([#12168](https://github.com/netdata/netdata/pull/12168), [@MrZammler](https://github.com/MrZammler)) - Fix compilation warnings on macOS ([#12082](https://github.com/netdata/netdata/pull/12082), [@vlvkobal](https://github.com/vlvkobal)) - Remove SIZEOF_VOIDP and ENVIRONMENT{32,64} macros. ([#12046](https://github.com/netdata/netdata/pull/12046), [@vkalintiris](https://github.com/vkalintiris)) - Remove unused NETDATA_NO_ATOMIC_INSTRUCTIONS macro ([#12045](https://github.com/netdata/netdata/pull/12045), [@vkalintiris](https://github.com/vkalintiris)) - Remove NETDATA_WITH_UUID def because it's not used anywhere. 
([#12044](https://github.com/netdata/netdata/pull/12044), [@vkalintiris](https://github.com/vkalintiris)) - Inform cloud about inability to satisfy request ([#12041](https://github.com/netdata/netdata/pull/12041), [@underhood](https://github.com/underhood)) - Remove ACLK_NEWARCH_DEVMODE ([#12018](https://github.com/netdata/netdata/pull/12018), [@underhood](https://github.com/underhood)) </details> ## Deprecation notice <a id="deprecation-notice"></a> The following items will be removed in our next minor release (v1.35.0): > Patch releases (if any) will not be affected. | Component | Type | Replaced by | |----------------------------------------------------------------------------------------------------------------------|:-----------:|:-----------------------------------------------------------------------------------------------------------:| | [node.d](https://github.com/netdata/netdata/tree/v1.33.1/collectors/node.d.plugin#nodedplugin) | plugin | - | | [node.d/snmp](https://github.com/netdata/netdata/tree/v1.33.1/collectors/node.d.plugin/snmp) | collector | [go.d/snmp](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/snmp) | | [python.d/apache](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/apache) | collector | [go.d/apache](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache) | | [python.d/couchdb](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/couchdb) | collector | [go.d/couchdb](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/couchdb) | | [python.d/dns_query_time](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/dns_query_time) | collector | [go.d/dnsquery](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/dnsquery) | | [python.d/dnsdist](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/dnsdist) | collector | 
[go.d/dnsdist](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/dnsdist) | | [python.d/elasticsearch](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/elasticsearch) | collector | [go.d/elasticsearch](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/elasticsearch) | | [python.d/energid](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/energid) | collector | [go.d/energid](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/energid) | | [python.d/freeradius](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/freeradius) | collector | [go.d/freeradius](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/freeradius) | | [python.d/httpcheck](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/httpcheck) | collector | [go.d/httpcheck](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/httpcheck) | | [python.d/isc_dhcpd](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/isc_dhcpd) | collector | [go.d/isc_dhcpd](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/isc_dhcpd) | | [python.d/mysql](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/mysql) | collector | [go.d/mysql](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql) | | [python.d/nginx](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/nginx) | collector | [go.d/nginx](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) | | [python.d/phpfpm](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/phpfpm) | collector | [go.d/phpfpm](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/phpfpm) | | [python.d/portcheck](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/portcheck) | collector | 
[go.d/portcheck](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/portcheck) | | [python.d/powerdns](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/powerdns) | collector | [go.d/powerdns](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/powerdns) | | [python.d/redis](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/redis) | collector | [go.d/redis](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis) | | [python.d/web_log](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/web_log) | collector | [go.d/weblog](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog) | All the deprecated components will be moved to the [netdata/community](https://github.com/netdata/community) repository. ### Deprecated in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.33.0#deprecation-notice), the following items have been removed in this release: | Component | Type | Replaced by | |----------------------------------------------------------------------------------------------------------------|:---------:|:---------------------------------------------------------------------------------------:| | [backends](https://github.com/netdata/netdata/tree/v1.33.0/backends#metrics-long-term-archiving) | subsystem | [exporting engine](https://learn.netdata.cloud/docs/agent/exporting) | | [node.d/fronius](https://github.com/netdata/netdata/tree/v1.33.0/collectors/node.d.plugin/fronius) | collector | - | | [node.d/sma_webbox](https://github.com/netdata/netdata/tree/v1.33.0/collectors/node.d.plugin/sma_webbox) | collector | - | | [node.d/stiebeleltron](https://github.com/netdata/netdata/tree/v1.33.0/collectors/node.d.plugin/stiebeleltron) | collector | - | | [node.d/named](https://github.com/netdata/netdata/tree/v1.33.0/collectors/node.d.plugin/named) | collector | 
[go.d/bind](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/bind) | ## Support options Supporting people in using and building with Netdata is very important to us! Should you need any help or encounter an issue with any of the changes made in this release, feel free to get in touch with the community through the following channels: - [GitHub](https://github.com/netdata): Report bugs or submit a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Share your ideas, and be part of the Netdata Agent development process. - [Community forum](https://community.netdata.cloud): Collaborate with other troubleshooters in building a community-driven knowledge base around Netdata. - [Discord](https://discord.gg/2eduZdSeC7): Join us in celebrating the culture of infrastructure monitoring. Hang out with like-minded sysadmins, SREs, and troubleshooters. 2022-04-14T17:57:41+00:00 netdata v1.34.1 netdata v1.34.1 2022-04-15T18:09:07+00:00 This patch release fixes versioning issues that occurred in the latest release ([Netdata v1.34](https://github.com/netdata/netdata/releases/tag/1.34.0)): - The release artifacts on the release itself showed a version of v1.33.1-339-g0046735ba instead of v1.34.0. - The binaries for the release, irrespective of the source, also showed the same version. - The Docker images for the release have incorrect image tags that are inconsistent with our previous Docker image tags. - Git tags ended up partially duplicated. ## Support options Supporting people in using and building with Netdata is very important to us! Should you need any help or encounter an issue with any of the changes made in this release, feel free to get in touch with the community through the following channels: - [GitHub](https://github.com/netdata): Report bugs or submit a new feature request.
- [GitHub Discussions](https://github.com/netdata/netdata/discussions): Share your ideas, and be part of the Netdata Agent development process. - [Community forum](https://community.netdata.cloud): Collaborate with other troubleshooters in building a community-driven knowledge base around Netdata. - [Discord](https://discord.gg/2eduZdSeC7): Join us in celebrating the culture of infrastructure monitoring. Hang out with like-minded sysadmins, SREs, and troubleshooters. 2022-04-15T18:09:07+00:00 netdata v1.35.0 netdata v1.35.0 2022-06-08T18:51:44+00:00 **Table of contents** - [Release highlights](#v1350-release-highlights) - [Anomaly Advisor & on-device Machine Learning](#v1350-anomaly-advisor-ml) - [Metrics Correlation on Agent](#v1350-metric-correlation-agent) - [Kubernetes monitoring](#v1350-kubernetes-monitoring) - [Visualization improvements](#v1350-visualization-improvements) - [Alerts management](#v1350-alerts-management) - [Nodes management](#v1350-nodes-management) - [StatsD improvements](#v1350-statsd-improvements) - [3x faster agent queries](#v1350-3-times-faster) - [Streaming](#v1350-streaming) - [More optimizations](#v1350-more-optimizations) - [New MQTT Client - Tech Preview](#v1350-new-mqtt-5) - [Acknowledgments](#v1350-ack) - [Contributions](#v1350-contributions) - [Deprecation notice](#v1350-deprecation-notice) - [Netdata Agent release meetup](#v1350-release-meetup) - [Support options](#v1350-support-options) <!-- Remove if there are no deprecations --> > ❗ We're keeping our codebase healthy by removing features that are end of life. Read the [deprecation notice](#v1350-deprecation-notice) to check if you are affected. 
#### Netdata open-source Agent statistics <a id="v1350-agent-statistics"></a> - 7.6M+ troubleshooters monitor with Netdata - 1.3M+ unique nodes currently live - 3.3k+ new nodes per day - Over 556M Docker pulls all-time total ## Release highlights <a id="v1350-release-highlights"></a> ### Anomaly Advisor & on-device Machine Learning <a id="v1350-anomaly-advisor-ml"></a> We are excited to launch one of our flagship machine learning (ML) assisted troubleshooting features in Netdata: the Anomaly Advisor. Netdata now comes with on-device ML! Unsupervised ML models are trained for every metric, at the edge (on your devices), enabling real-time anomaly detection across your infrastructure. ![image](https://user-images.githubusercontent.com/24860547/172659919-2703dba8-d8e8-412f-b47e-59b2a44e621d.png) This feature is part of a broader philosophy we have at Netdata when it comes to how we can leverage ML-based solutions to help augment and assist traditional troubleshooting workflows, without having to centralize all your data. The new Anomalies tab quickly lets you find periods of time with elevated anomaly rates across all of your nodes. Once you highlight a period of interest, Netdata will generate a ranked list of the most anomalous metrics across all nodes in the highlighted timeframe. The goal is to quickly let you find periods of abnormal activity in your infrastructure and bring to your attention the metrics that were most anomalous during that time. In our latest release, we improved the usability of Anomaly Advisor and also ensured that the anomalous metrics are always relevant to the time period you are investigating. A great deal of care has gone into ensuring that ML running on your device is as lightweight in terms of resource consumption as possible.
For instance, metrics that do not have sufficient data for training and metrics that are consistently constant during training periods are considered to be "normal" until their behavior changes significantly enough to require re-training of the ML models. To use this feature, please enable ML on your agent and then navigate to the "Anomalies" tab in Netdata Cloud. Update `netdata.conf` with the following information to enable ML on your agent: ``` [ml] enabled = yes ``` [Read more about Anomaly Advisor at our blog](https://www.netdata.cloud/blog/introducing-anomaly-advisor-unsupervised-anomaly-detection-in-netdata/?utm_campaign=Campaign_ReleaseNotes_1_35&utm_source=GitHub&utm_medium=GitHub_ReleaseNotes&utm_content=direct_link_netdata_blog). ### Metrics Correlation on Agent <a id="v1350-metric-correlation-agent"></a> Metric Correlations allow you to quickly find metrics and charts related to a particular window of interest that you want to explore further. Metric Correlations compare two adjacent windows to find how they relate to each other, and then score all metrics based on this rating, providing a list of metrics that may have influenced, or been influenced by, the highlighted one. Metric Correlation was already available in Netdata Cloud, but now we are releasing a version implemented at the Netdata Agent, which drastically reduces the time required for it to run. This means the metric correlation can now run almost **instantly** (more than 10x faster than before)! To enable the new metric correlation at the Netdata Agent, set the following in your `netdata.conf` file: ``` [global] enable metric correlations = yes ``` ### Kubernetes monitoring <a id="v1350-kubernetes-monitoring"></a> On very busy Kubernetes clusters where hundreds of containers spawn and are destroyed all the time, Netdata was consuming a lot of resources, was slow to detect changes, and under certain conditions missed certain containers. Now, Netdata: 1.
Detects "pause" containers and skips them, greatly improving the performance during discovery 2. Detects containers that are initializing and postpones discovery for them until they are properly initialized 3. Uses fewer resources, and uses them more efficiently, during container discovery Netdata is also capable of detecting the network interfaces that have been allocated to containers, by spawning a process that switches network namespace and identifies virtual interfaces that belong to each container. This process has been improved drastically, now requiring 1/3 of the CPU resources it needed before. Additionally, Netdata `cgroups.plugin` now collects CPU shares for Kubernetes containers, allowing the visualization of the Kubernetes CPU Requests (Kubernetes writes in cgroup CPU Shares the CPU Requests that have been configured for the containers). A new option has been added in the `[plugin:cgroup]` section of `netdata.conf`, to allow filtering containers by (resolved) name. It matches the name of the cgroup (as you see it on the dashboard). We have also released a blog post and a video about CPU Throttling in Kubernetes. You will be amazed by our findings. [Read the blog and watch the video about Kubernetes CPU throttling](https://www.netdata.cloud/blog/kubernetes-throttling-doesnt-have-to-suck-let-us-help/?utm_campaign=Campaign_ReleaseNotes_1_35&utm_source=GitHub&utm_medium=GitHub_ReleaseNotes&utm_content=direct_link_netdata_blog). ### Visualization improvements <a id="v1350-visualization-improvements"></a> Netdata Cloud dashboards are now a lot faster in aggregating data from multiple agents, as the protocol between agents and the Cloud is approaching its final shape.
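To make the CPU Requests relationship from the Kubernetes monitoring section concrete, here is a minimal Python sketch of the conversion in both directions. It assumes cgroup v1 and the commonly documented Kubernetes mapping of 1024 shares per CPU core with a floor of 2 shares. The function names are hypothetical and are not part of Netdata or `cgroups.plugin`.

```python
# Illustrative only: these helpers are hypothetical, not Netdata APIs.
# Assumes cgroup v1, where Kubernetes derives cpu.shares from the CPU request.
SHARES_PER_CPU = 1024   # cgroup v1 weight corresponding to one full CPU
MILLI_PER_CPU = 1000    # Kubernetes expresses CPU requests in millicores
MIN_SHARES = 2          # smallest cpu.shares value the kernel accepts


def request_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in millicores to a cgroup v1 cpu.shares value."""
    if milli_cpu <= 0:
        return MIN_SHARES
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // MILLI_PER_CPU)


def shares_to_request(shares: int) -> float:
    """Approximate the CPU request (in cores) behind a cpu.shares value."""
    return shares / SHARES_PER_CPU
```

Under these assumptions, a request of 500 millicores corresponds to 512 shares, and reading 512 shares back gives 0.5 cores. This is the kind of value a CPU Requests chart would plot.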
#### New look for Netdata charts Netdata Cloud has a new look and feel for charts, which resembles the look and feel of coding IDEs: ![image](https://user-images.githubusercontent.com/2662304/172681362-c63a58a2-d2bd-4bcd-b11e-ecb4364d1b72.png) #### New home for war rooms <a id="v1350-visualization-improvements-home-tab"></a> The new home tab for war rooms allows you to quickly inspect the most important metrics for every war room, like number of nodes, metrics, retention, replication, alerts, users, custom dashboards, etc. ![](https://user-images.githubusercontent.com/12612986/172675676-e62e9d1d-55c0-41b8-8b3d-735987c2f963.png) #### Time units <a id="v1350-visualization-improvements-time-units"></a> Time units in charts now auto-scale from microseconds to days, based on the value of time to be shown. #### Cloud queries timeout <a id="v1350-visualization-improvements-cloud-queris"></a> Netdata Cloud now sets a timeout on every query it sends to the agents, and the agents now respect this timeout. Previously, the Cloud was timing out because of a slow query, but the agents remained busy executing that query, which had a waterfall effect on the agent load. #### Custom dashboards <a id="v1350-visualization-improvements-custom-dashboards"></a> Custom dashboards on Netdata Cloud can now be renamed. ### Alerts management <a id="v1350-alerts-management"></a> #### All configured alerts on the Cloud <a id="v1350-alerts-management-sub-tab"></a> We have added a new Alert Configs sub tab which lists all the alerts configured on all the nodes belonging to the war room. You can now list the configured alerts at the war room, node, and alert instance level. #### Stale alerts <a id="v1350-alerts-management-stale-alerts"></a> There have been a number of corner cases under which alerts could remain raised on Netdata Cloud. We identified all such cases, and now Netdata Cloud is always in sync with Netdata agents about their alerts. 
### Nodes management <a id="v1350-nodes-management"></a> #### Cloud provider metadata <a id="v1350-nodes-management-cloud-provider-metdata"></a> Netdata now identifies the Cloud provider node type it runs on. It works for GCP and AWS, and exposes this information at the Nodes tab, the single node dashboard, and the node inspector. ![](https://user-images.githubusercontent.com/12612986/172683505-453f45ba-bd85-4849-9a56-5268cedb53d6.png) #### Virtualization detection fixes <a id="v1350-nodes-management-virt-detection"></a> We improved the virtualization detection in cases where systemd is not available. Now Netdata can properly detect virtualization even in these cases. #### Global nodes filter on all tabs of a space <a id="v1350-nodes-management-global-nodes-filter"></a> The new Netdata Cloud now supports a global filter on nodes of war rooms. The new filter is applied on every tab for each room, allowing users to quickly switch between tabs while retaining the nodes filtered. ![](https://user-images.githubusercontent.com/12612986/172674440-df224058-2b2c-41da-bb45-f4eb82e342e5.png) #### Obsoletion of nodes <a id="v1350-obsoletion-of-nodes"></a> <a id="v1350-nodes-management-node-obsoletion"></a> Netdata admin users now have the ability to remove obsolete nodes from a space. Many users have been eagerly waiting for this feature, and we thank you for your patience. We hope you will be happy to use the feature and have cleaner spaces and war rooms. 
A few notes to be considered: - Only admin users have the ability to obsolete nodes - Only offline nodes can be marked obsolete (live nodes and stale nodes cannot be obsoleted) - Node obsoletion works across the entire space, so the obsoleted node will be removed from all rooms belonging to the space - If the obsoleted nodes eventually become live or online once more, they will be automatically re-added to the space ### StatsD improvements <a id="v1350-statsd-improvements"></a> Every Netdata Agent is a StatsD server, listening on localhost port 8125, both TCP and UDP. You can use the Netdata StatsD server to quickly visualize metrics from scripts, cron jobs, and local applications. In this release, the Netdata StatsD server has been improved to use Judy arrays for indexing the collected metrics, drastically improving its performance. At the same time we extended the StatsD protocol to support `dictionaries`. Dictionaries are similar to `sets`, but instead of reporting only the number of unique entries in the `set`, `dictionaries` create a counter for each of the values and report the number of occurrences for each unique event. So, to quickly get a breakdown of events, you can push them to StatsD like `myapp.metric:EVENT|d`. StatsD will create a chart for `myapp.metric` and for each unique `EVENT` it will create a dimension with the number of times this event was encountered. We also added the ability to change the units of the chart and the family of the chart, using StatsD tags, like this: `myapp.metric:EVENT|d|#units=events/s`. Finally, StatsD now automatically creates a dashboard section for every StatsD application name. Following StatsD best practices, these application names are considered to be the first keyword of collected metrics. For example, by pushing the metric `myapp.metric:1|c`, StatsD will create the dashboard section "StatsD myapp". 
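As a rough illustration of the dictionary metrics described above, the following Python sketch builds lines in the `myapp.metric:EVENT|d` form and sends them over UDP. The port 8125 and the `|d` and `#units=` syntax come from the text above; the helper names `format_dictionary_metric` and `send_metric` and the metric name `myapp.requests` are hypothetical, chosen only for this example.

```python
import socket


def format_dictionary_metric(name, event, units=None):
    """Build a StatsD dictionary line such as 'myapp.metric:EVENT|d'.

    The optional units tag follows the '#units=' form shown in the
    release notes. This helper is illustrative, not part of Netdata.
    """
    line = f"{name}:{event}|d"
    if units:
        line += f"|#units={units}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one StatsD line over UDP (fire-and-forget, no reply expected)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Example usage, assuming a Netdata Agent is listening locally:
# for status in ("ok", "ok", "timeout"):
#     send_metric(format_dictionary_metric("myapp.requests", status,
#                                          units="events/s"))
```

With the usage lines enabled, each distinct `status` value would be expected to show up as its own dimension on the `myapp.requests` chart, per the behavior described above.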
Read more at the [Netdata StatsD documentation](https://learn.netdata.cloud/docs/agent/collectors/statsd.plugin?utm_campaign=Campaign_ReleaseNotes_1_35&utm_source=GitHub&utm_medium=GitHub_ReleaseNotes&utm_content=direct_link_docs). A real-life example of using Netdata StatsD from a shell script that pushes metrics in real time to a local Netdata Agent is available at this [stress-with-curl.sh gist](https://gist.github.com/ktsaou/df6556f0702f5263a3f82f6cfd9f5f30). ### 3x faster agent queries <a id="v1350-3-times-faster"></a> Netdata dashboards refresh all visible charts in parallel, utilizing all the resources the web browsers provide to quickly present the required charts. Since Netdata only stores metric data at the agents, all these queries are executed in parallel at the agents. This parallelism of queries is even more intense when metrics replication/streaming is configured. In these cases, parent Netdata agents centralize metric data from many agents, and, since Netdata Cloud prefers the more distant parents for queries, they receive quite a few queries in parallel for all their children. We also reworked many parts of the query engine of Netdata agents to achieve top performance in parallel queries. Now, Netdata agents are able to perform queries at a rate of more than **30 million points per second, per core** on modern hardware. On a parent Netdata agent with a 24-core CPU we observed a sustained rate of **1.3 billion points per second**! This is 3 times faster compared to the previous release. To achieve these performance improvements we worked in these areas: #### Query memory management <a id="v1350-3-times-faster-memory-management"></a> When querying metric data, a lot of memory allocations need to happen. Although Netdata agents automatically adapt their memory requirements for data collection, avoiding memory operations while iterating to collect data, this is unfortunately not feasible on the query engine side. 
To make the agent more efficient for queries, the number of system calls allocating memory had to be drastically decreased. So, we developed a `One Way Allocator` (`OWA`), a system that works like a scratchpad for memory allocations. When the query starts, we now predict the amount of memory needed to execute the query. The query engine still does all the individual allocations, but all these are now made against the scratchpad, not against the system. `OWA` is smart enough to increase the size of the scratchpad if needed during querying. And it frees all memory at once without the need for individual memory releases. For huge data queries, the benefit is astonishing. For certain heavy data queries, 45000 memory allocations before are down to 20 with this release! This doubled the performance of the query engine. #### Number unpacking <a id="v1350-3-times-faster-number-unpacking"></a> To optimize its memory footprint for metric data, Netdata agents store collected metric data into a fixed step database (after interpolation) with a custom floating point number format we developed (we call it `storage_number`), requiring just 4 bytes per data collection point, including the timestamp. When on disk, mainly due to compression, Netdata's dbengine needs just 0.34 bytes per point (including all metadata), which is probably the best among all monitoring solutions available today, allowing Netdata to massively store and manage metric data at a very high rate. This means however, that in order to actually use a point in a query, we have to **unpack it**. This unpacking happens point-by-point even for data cached in memory. 1 billion points in a data query, 1 billion numbers unpacked. In this release we analyzed the CPU cache efficiency of the number unpacking and we refactored it to make the best use of available CPU caches to finally increase its performance by 30%. 
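To make the unpacking cost concrete, here is a minimal Python sketch of the general idea of replacing per-point arithmetic with a precomputed lookup table (the contribution list for this release mentions a lookup table for storage number unpacking). The 32-bit layout below is an assumption invented for this example, not Netdata's actual `storage_number` format, and the names `pack`, `unpack_naive`, and `unpack_lookup` are hypothetical.

```python
# Hypothetical 32-bit layout, NOT Netdata's real storage_number:
#   bit 31     sign (1 = negative)
#   bit 30     scale direction (1 = divide, 0 = multiply)
#   bits 27-29 decimal exponent (0..7)
#   bits 0-23  mantissa
SIGN_BIT = 1 << 31
DIVIDE_BIT = 1 << 30
EXP_SHIFT = 27
EXP_MASK = 0x7
MANTISSA_MASK = (1 << 24) - 1


def pack(mantissa, exponent, divide=False, negative=False):
    """Pack the fields into one 32-bit word (helper for the demo)."""
    word = (mantissa & MANTISSA_MASK) | ((exponent & EXP_MASK) << EXP_SHIFT)
    if divide:
        word |= DIVIDE_BIT
    if negative:
        word |= SIGN_BIT
    return word


def unpack_naive(packed):
    """Unpack by looping: one multiply or divide per exponent step."""
    value = float(packed & MANTISSA_MASK)
    for _ in range((packed >> EXP_SHIFT) & EXP_MASK):
        if packed & DIVIDE_BIT:
            value /= 10.0
        else:
            value *= 10.0
    return -value if packed & SIGN_BIT else value


# Bits 27-30 (exponent plus divide flag) form a 4-bit index, so all
# 16 possible scale factors can be computed once, up front.
FACTORS = [10.0 ** e for e in range(8)] + [10.0 ** -e for e in range(8)]


def unpack_lookup(packed):
    """Unpack with a single table lookup and a single multiplication."""
    value = (packed & MANTISSA_MASK) * FACTORS[(packed >> EXP_SHIFT) & 0xF]
    return -value if packed & SIGN_BIT else value
```

Both functions decode the same value up to floating-point rounding; the table version removes the data-dependent loop from the hot path, which is the kind of change that matters when a query unpacks a billion points.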
### Streaming <a id="v1350-streaming"></a> This release includes a better algorithm to pick the available parent to stream metrics to. The previous version was always reconnecting to the first available parent. Now it rotates through them, one by one, and then starts over. An issue was fixed regarding parents with stale alerts from disconnected children. Now, the parent validates all alerts on every child re-connection. Netdata parents now have a timeout to clean up dead/abandoned children connections automatically. We also worked to eliminate most of the bottlenecks when multiple children connect to the same parent. This is still under testing, so it will arrive in the next release. ### More optimizations <a id="v1350-more-optimizations"></a> #### Workers optimizations <a id="v1350-more-optimizations-workers"></a> Netdata uses many workers to execute several of its features. There are web workers, aclk workers, dbengine workers, health monitoring workers, libuv workers, and many more. We managed to identify many deadlocks that slowed down the whole operation. We also increased the number of workers to deliver more capacity on busy parents. There is a new section for monitoring Netdata workers at the "Netdata Monitoring" section of the dashboard. Using this monitoring, we are still working to make the workers even more efficient. #### Deadlocks <a id="v1350-more-optimizations-deadlocks"></a> The last release was hindered by rare deadlocks on very busy parents. These deadlocks are now gone, improving the agent's ability to centralize data from many children. #### Dictionaries are now using Judy arrays <a id="v1350-more-optimizations-judy-arrays"></a> Judy arrays are probably the fastest and most CPU cache-friendly indexes available. Netdata already uses them for dbengine and its page cache. Now all Netdata dictionaries are using them too, giving a performance boost to all dictionary operations, including StatsD. 
#### /proc collectors are now a lot faster <a id="v1350-more-optimizations-procs-collector"></a> Initialization of `/proc` collectors was suboptimal, because they had to go through a slow process of adapting their read buffers. We added a forward-looking algorithm to optimize this initialization, which now happens in 1/10th of the time. #### /proc/netdev collector is now isolated <a id="v1350-more-optimizations-netdev-collector"></a> Some users have experienced gaps in `/proc` plugin charts. We identified that these gaps were triggered by the `netdev` module, which was causing the whole plugin to slow down and miss data collection iterations. Now the `netdev` module of the `/proc` plugin runs on its own thread so that it no longer affects the rest of the `/proc` modules. #### Internal Web Server optimizations <a id="v1350-more-optimizations-internal-web-server"></a> The internal web server of Netdata now spreads the work among its worker threads more evenly, utilizing as much of the available parallelism as possible. #### Options in `netdata.conf` re-organized <a id="v1350-more-optimizations-options-netdataconf"></a> We re-organized the `[global]` section of the `netdata.conf`, so that it is more meaningful for new users. The new configurations are backward compatible. So, after you restart netdata with your old `netdata.conf`, grab the new one from `http://localhost:19999/netdata.conf` to have the new format. 
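The last step above (grabbing the regenerated configuration from the running agent) can be scripted. This is a minimal sketch that assumes the agent is reachable at the URL given above; the function name `fetch_netdata_conf` and the destination file name `netdata.conf.new` are hypothetical choices for this example.

```python
import shutil
import urllib.request


def fetch_netdata_conf(url="http://localhost:19999/netdata.conf",
                       dest="netdata.conf.new", timeout=10):
    """Download the configuration served by the agent and save it to dest.

    Writing to a separate file (an assumption of this sketch) lets you
    review the new format before replacing your existing netdata.conf.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example usage, assuming a local agent on the default port:
# print("saved to", fetch_netdata_conf())
```

Saving to a new file rather than overwriting in place keeps the old configuration available for comparison.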
### New MQTT Client - Tech Preview <a id="v1350-new-mqtt-5"></a> We now have our own MQTT implementation within our ACLK protocol that will eventually replace the current MQTT-C client for several reasons, including the following: * With the new MQTT implementation we now support MQTTv5 as our older implementation only supported MQTTv3 * Reduce memory usage - no need for large fixed size buffers to be allocated all the time * Reduce memory copying - no need to copy message contents multiple times * Remove max message size limit * Remove issues where big messages are starving other messages Currently, it’s provided as a tech preview, and it’s disabled by default. Feel free to have some fun with the new implementation. This is how to enable it in `netdata.conf`: ``` [cloud] mqtt5 = yes ``` ## Acknowledgments <a id="v1350-ack"></a> - [@JaphethLim](https://github.com/JaphethLim) for adding priority to Gotify notifications. - [@MarianSavchuk](https://github.com/MarianSavchuk) for adding Alma and Rocky distros as CentOS compatibility distro in netdata-updater. - [@aberaud](https://github.com/aberaud) for working on configurable storage engine. - [@atriwidada](https://github.com/atriwidada) for improving package dependency. - [@coffeegrind123](https://github.com/coffeegrind123) for adding Gotify notification method. - [@eltociear](https://github.com/eltociear) for fixing "GitHub" spelling in docs. - [@fqx](https://github.com/fqx) for adding `tailscaled` to apps_groups.conf. - [@k0ste](https://github.com/k0ste) for updating `net`, `aws`, and `ha` groups in apps_groups.conf. - [@kklionz](https://github.com/kklionz) for fixing a compilation warning. - [@olivluca](https://github.com/olivluca) for fixing appending logs to the old log file after logrotate on Debian. - [@petecooper](https://github.com/petecooper) for improving the usage message in netdata-installer. - [@simon300000](https://github.com/simon300000) for adding `caddy` to apps_groups.conf. 
## Contributions <a id="v1350-contributions"></a> ### Collectors <a id="v1350-collectors"></a> #### New <a id="v1350-collectors-new"></a> - Add "UPS Load Usage" in Watts chart (charts.d/apcupsd) ([#12965](https://github.com/netdata/netdata/pull/12965), [@ilyam8](https://github.com/ilyam8)) - Add Pressure Stall Information stall time charts (proc.plugin, cgroups.plugin) ([#12869](https://github.com/netdata/netdata/pull/12869), [@ilyam8](https://github.com/ilyam8)) - Add "CPU Time Relative Share" chart when running inside a K8s cluster (cgroups.plugin) ([#12741](https://github.com/netdata/netdata/pull/12741), [@ilyam8](https://github.com/ilyam8)) - Add a collector that parses the log files of the OpenVPN server (go.d/openvpn_status_log) ([#675](https://github.com/netdata/go.d.plugin/pull/675), [@surajnpn](https://github.com/surajnpn)) #### Improvements <a id="v1350-collectors-improvements"></a> ⚙️ Enhancing our collectors to collect all the data you need. <details> <summary>Show 14 more contributions </summary> - Add Tailscale apps_groups.conf (apps.plugin) ([#13033](https://github.com/netdata/netdata/pull/13033), [@fqx](https://github.com/fqx)) - Skip collecting network interface speed and duplex if carrier is down (proc.plugin) ([#13019](https://github.com/netdata/netdata/pull/13019), [@vlvkobal](https://github.com/vlvkobal)) - Run the /net/dev module in a separate thread (proc.plugin) ([#12996](https://github.com/netdata/netdata/pull/12996), [@vlvkobal](https://github.com/vlvkobal)) - Add dictionary support to statsd ([#12980](https://github.com/netdata/netdata/pull/12980), [@ktsaou](https://github.com/ktsaou)) - Add an option to filter the alarms (python.d/alarms) ([#12972](https://github.com/netdata/netdata/pull/12972), [@andrewm4894](https://github.com/andrewm4894)) - Update net, aws, and ha groups in apps_groups.conf (apps.plugin) ([#12921](https://github.com/netdata/netdata/pull/12921), [@k0ste](https://github.com/k0ste)) - Add k8s_cluster_name label to 
cgroup charts in K8s on GKE (cgroups.plugin) ([#12858](https://github.com/netdata/netdata/pull/12858), [@ilyam8](https://github.com/ilyam8)) - Exclude Proxmox bridge interfaces (proc.plugin) ([#12789](https://github.com/netdata/netdata/pull/12789), [@ilyam8](https://github.com/ilyam8)) - Add filtering by cgroups name and improve renaming in K8s (cgroups.plugin) ([#12778](https://github.com/netdata/netdata/pull/12778), [@ilyam8](https://github.com/ilyam8)) - Execute the renaming script only for containers in K8s (cgroups.plugin) ([#12747](https://github.com/netdata/netdata/pull/12747), [@ilyam8](https://github.com/ilyam8)) - Add k8s_qos_class label to cgroup charts in K8s (cgroups.plugin) ([#12737](https://github.com/netdata/netdata/pull/12737), [@ilyam8](https://github.com/ilyam8)) - Reduce the CPU time required for cgroup-network-helper.sh (cgroups.plugin) ([#12711](https://github.com/netdata/netdata/pull/12711), [@ilyam8](https://github.com/ilyam8)) - Add Proxmox VE processes to apps_groups.conf (apps.plugin) ([#12704](https://github.com/netdata/netdata/pull/12704), [@ilyam8](https://github.com/ilyam8)) - Add Caddy to apps_groups.conf (apps.plugin) ([#12678](https://github.com/netdata/netdata/pull/12678), [@simon300000](https://github.com/simon300000)) </details> #### Bug fixes <a id="v1350-collectors-bug-fixes"></a> 🐞 Improving our collectors one bug fix at a time. 
<details> <summary>Show 11 more contributions </summary> - Fix adding wrong labels to cgroup charts (cgroups.plugin) ([#13062](https://github.com/netdata/netdata/pull/13062), [@ilyam8](https://github.com/ilyam8)) - Fix cpu_guest chart context (apps.plugin) ([#12983](https://github.com/netdata/netdata/pull/12983), [@ilyam8](https://github.com/ilyam8)) - Fix counting unique values in Sets (statsd.plugin) ([#12963](https://github.com/netdata/netdata/pull/12963), [@ktsaou](https://github.com/ktsaou)) - Fix collecting data from uninitialized containers in K8s (cgroups.plugin) ([#12912](https://github.com/netdata/netdata/pull/12912), [@ilyam8](https://github.com/ilyam8)) - Fix CPU-specific data in the "C-state residency time" chart dimensions (proc.plugin) ([#12898](https://github.com/netdata/netdata/pull/12898), [@vlvkobal](https://github.com/vlvkobal)) - Fix memory usage calculation by considering ZFS ARC as cache on FreeBSD (freebsd.plugin)([#12879](https://github.com/netdata/netdata/pull/12879), [@vlvkobal](https://github.com/vlvkobal)) - Fix disabling K8s pod/container cgroups when fail to rename them (cgroups.plugin) ([#12865](https://github.com/netdata/netdata/pull/12865), [@ilyam8](https://github.com/ilyam8)) - Fix memory usage calculation by considering ZFS ARC as cache on Linux (proc.plugin) ([#12847](https://github.com/netdata/netdata/pull/12847), [@ilyam8](https://github.com/ilyam8)) - Fix adding network interfaces when the cgroup proc is in the host network namespace (cgroups.plugin) ([#12788](https://github.com/netdata/netdata/pull/12788), [@ilyam8](https://github.com/ilyam8)) - Fix not setting chart units (go.d/snmp) ([#682](https://github.com/netdata/go.d.plugin/pull/682), [@ilyam8](https://github.com/ilyam8)) - Fix not collecting Integer type values (go.d/snmp) ([#680](https://github.com/netdata/go.d.plugin/pull/680), [@surajnpn](https://github.com/surajnpn)) </details> ### eBPF <a id="v1350-ebpf"></a> - Add CO-RE algorithms to all threads related to 
memory ([#12684](https://github.com/netdata/netdata/pull/12684), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix wrong chart type for ip charts ([#12698](https://github.com/netdata/netdata/pull/12698), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix disabled apps (ebpf.plugin) ([#13044](https://github.com/netdata/netdata/pull/13044), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix "libbpf: failed to load" warnings ([#12831](https://github.com/netdata/netdata/pull/12831), [@thiagoftsm](https://github.com/thiagoftsm)) - Re-enable socket module by default ([#12702](https://github.com/netdata/netdata/pull/12702), [@ilyam8](https://github.com/ilyam8)) ### Health <a id="v1350-health"></a> - Fix not respecting host labels when creating alerts for children instances ([#13053](https://github.com/netdata/netdata/pull/13053), [@MrZammler](https://github.com/MrZammler)) - Expose anomaly-bit option to health ([#12835](https://github.com/netdata/netdata/pull/12835), [@vkalintiris](https://github.com/vkalintiris)) - Add priority to Gotify notifications to trigger sound & vibration on the Gotify phone app ([#12753](https://github.com/netdata/netdata/pull/12753), [@JaphethLim](https://github.com/JaphethLim)) - Add Gotify notification method ([#12639](https://github.com/netdata/netdata/pull/12639), [@coffeegrind123](https://github.com/coffeegrind123)) ### Streaming <a id="v1350-streaming"></a> - Improve failover logic when the Agent is configured to stream to multiple destinations ([#12866](https://github.com/netdata/netdata/pull/12866), [@MrZammler](https://github.com/MrZammler)) - Increase the default "buffer size bytes" to 10MB ([#12913](https://github.com/netdata/netdata/pull/12913), [@ilyam8](https://github.com/ilyam8)) ### Exporting <a id="v1350-exporting"></a> - Add the URL query parameter that filters charts from the /allmetrics API query ([#12820](https://github.com/netdata/netdata/pull/12820), [@vlvkobal](https://github.com/vlvkobal)) - Make the "send charts 
matching" option behave the same as the "filter" URL query parameter for prometheus format ([#12832](https://github.com/netdata/netdata/pull/12832), [@ilyam8](https://github.com/ilyam8)) ### Documentation <a id="v1350-documentantion"></a> 📄 Keeping our documentation healthy together with our awesome community. <details> <summary>Show 11 more contributions </summary> - Add note about Anomaly Advisor ([#13042](https://github.com/netdata/netdata/pull/13042), [@andrewm4894](https://github.com/andrewm4894)) - Add a note on possibly alternate location of the cloud.d directory ([#12987](https://github.com/netdata/netdata/pull/12987), [@cakrit](https://github.com/cakrit)) - Improve instructions on how to reconnect a node to Cloud ([#12891](https://github.com/netdata/netdata/pull/12891), [@cakrit](https://github.com/cakrit)) - Fix unresolved file references ([#12872](https://github.com/netdata/netdata/pull/12872), [@ilyam8](https://github.com/ilyam8)) - Update ML defaults in docs ([#12782](https://github.com/netdata/netdata/pull/12782), [@andrewm4894](https://github.com/andrewm4894)) - Add parent-child configuration examples to ML docs ([#12734](https://github.com/netdata/netdata/pull/12734), [@andrewm4894](https://github.com/andrewm4894)) - Add a note about serial numbers in chart names in the plugins.d API documentation ([#12733](https://github.com/netdata/netdata/pull/12733), [@vlvkobal](https://github.com/vlvkobal)) - Fix a typo in macOS documentation ([#12724](https://github.com/netdata/netdata/pull/12724), [@MrZammler](https://github.com/MrZammler)) - Add a description of interactive/non-interactive modes to the "Uninstall Netdata" doc ([#12687](https://github.com/netdata/netdata/pull/12687), [@odynik](https://github.com/odynik)) - Fix "GitHub" spelling ([#12682](https://github.com/netdata/netdata/pull/12682), [@eltociear](https://github.com/eltociear)) - Add new dashboard/web server reference file ([#11161](https://github.com/netdata/netdata/pull/11161), 
[@joelhans](https://github.com/joelhans)) </details> ### Packaging / Installation <a id="v1350-packaging-installation"></a> 📦 "Handle with care" - Just like handling physical packages, we put in a lot of care and effort to publish beautiful software packages. <details> <summary>Show 29 more contributions </summary> - Add Alma Linux 9 and RHEL 9 support to CI and packaging ([#13058](https://github.com/netdata/netdata/pull/13058), [@Ferroin](https://github.com/Ferroin)) - Fix handling of temp directory in kickstart when uninstalling ([#13056](https://github.com/netdata/netdata/pull/13056), [@Ferroin](https://github.com/Ferroin)) - Only try to update repo metadata in updater script if needed ([#13009](https://github.com/netdata/netdata/pull/13009), [@Ferroin](https://github.com/Ferroin)) - Use printf instead of echo for printing collected warnings in kickstart ([#13002](https://github.com/netdata/netdata/pull/13002), [@Ferroin](https://github.com/Ferroin)) - Don't kill Netdata PIDs if successfully stopped Netdata in installer/uninstaller ([#12982](https://github.com/netdata/netdata/pull/12982), [@ilyam8](https://github.com/ilyam8)) - Properly handle the case when 'tput colors' does not return a number in kickstart ([#12979](https://github.com/netdata/netdata/pull/12979), [@ilyam8](https://github.com/ilyam8)) - Update libbpf version to v0.8.0 ([#12945](https://github.com/netdata/netdata/pull/12945), [@thiagoftsm](https://github.com/thiagoftsm)) - Update default fping version to 5.1 ([#12930](https://github.com/netdata/netdata/pull/12930), [@ilyam8](https://github.com/ilyam8)) - Update go.d.plugin version to v0.32.3 ([#12862](https://github.com/netdata/netdata/pull/12862), [@ilyam8](https://github.com/ilyam8)) - Autodetect channel for specific version in kickstart ([#12856](https://github.com/netdata/netdata/pull/12856), [@maneamarius](https://github.com/maneamarius)) - Fix "Bad file descriptor" error in netdata-uninstaller 
([#12828](https://github.com/netdata/netdata/pull/12828), [@maneamarius](https://github.com/maneamarius)) - Add support for installing static builds on systems without usable internet connections ([#12809](https://github.com/netdata/netdata/pull/12809), [@Ferroin](https://github.com/Ferroin)) - Add --repositories-only option to kickstart ([#12806](https://github.com/netdata/netdata/pull/12806), [@maneamarius](https://github.com/maneamarius)) - Rename --install option for kickstart.sh ([#12798](https://github.com/netdata/netdata/pull/12798), [@maneamarius](https://github.com/maneamarius)) - Fix to avoid recompiling protobuf all the time ([#12790](https://github.com/netdata/netdata/pull/12790), [@ktsaou](https://github.com/ktsaou)) - Fix non-interpreted new lines when printing deferred errors in netdata-installer ([#12786](https://github.com/netdata/netdata/pull/12786), [@ilyam8](https://github.com/ilyam8)) - Fix a typo in the warning() function in netdata-installer ([#12781](https://github.com/netdata/netdata/pull/12781), [@ilyam8](https://github.com/ilyam8)) - Fix checking of environment file in netdata-updater ([#12768](https://github.com/netdata/netdata/pull/12768), [@Ferroin](https://github.com/Ferroin)) - Add a missing function and Alma and Rocky distros as CentOS compatibility distro to netdata-updater ([#12757](https://github.com/netdata/netdata/pull/12757), [@MarianSavchuk](https://github.com/MarianSavchuk)) - Improve the usage message in netdata-installer ([#12755](https://github.com/netdata/netdata/pull/12755), [@petecooper](https://github.com/petecooper)) - Make atomics a hard-dependency ([#12730](https://github.com/netdata/netdata/pull/12730), [@vkalintiris](https://github.com/vkalintiris)) - Add --install-version flag for installing specific Netdata version to kickstart ([#12729](https://github.com/netdata/netdata/pull/12729), [@maneamarius](https://github.com/maneamarius)) - Correctly propagate errors and warnings up to the kickstart script from 
scripts it calls ([#12686](https://github.com/netdata/netdata/pull/12686), [@Ferroin](https://github.com/Ferroin)) - Fix not-respecting of NETDATA_LISTENER_PORT in docker healthcheck ([#12676](https://github.com/netdata/netdata/pull/12676), [@ilyam8](https://github.com/ilyam8)) - Add options to kickstart for explicitly passing options to installer code ([#12658](https://github.com/netdata/netdata/pull/12658), [@Ferroin](https://github.com/Ferroin)) - Improve handling of release channel selection in kickstart ([#12635](https://github.com/netdata/netdata/pull/12635), [@Ferroin](https://github.com/Ferroin)) - Treat auto-updates as a tristate internally in the kickstart script ([#12634](https://github.com/netdata/netdata/pull/12634), [@Ferroin](https://github.com/Ferroin)) - Include proper package dependency ([#12518](https://github.com/netdata/netdata/pull/12518), [@atriwidada](https://github.com/atriwidada)) - Fix appending logs to the old log file after logrotate on Debian ([#9377](https://github.com/netdata/netdata/pull/9377), [@olivluca](https://github.com/olivluca)) </details> ### Other Notable Changes <a id="v1350-notable"></a> #### Improvements <a id="v1350-notable-improvements"></a> ⚙️ Greasing the gears to smoothen your experience with Netdata. 
<details> <summary>Show 43 more contributions </summary> - Add hostname to mirrored hosts in the /api/v1/info endpoint ([#13030](https://github.com/netdata/netdata/pull/13030), [@ktsaou](https://github.com/ktsaou)) - Optimize query engine queries ([#12988](https://github.com/netdata/netdata/pull/12988), [@ktsaou](https://github.com/ktsaou)) - Optimize query engine and cleanup ([#12978](https://github.com/netdata/netdata/pull/12978), [@ktsaou](https://github.com/ktsaou)) - Improve the web server work distribution across worker threads ([#12975](https://github.com/netdata/netdata/pull/12975), [@ktsaou](https://github.com/ktsaou)) - Check link local address before querying cloud instance metadata ([#12973](https://github.com/netdata/netdata/pull/12973), [@ilyam8](https://github.com/ilyam8)) - Speed up query engine by refactoring rrdeng_load_metric_next() ([#12966](https://github.com/netdata/netdata/pull/12966), [@ktsaou](https://github.com/ktsaou)) - Optimize the dimensions option store to the metadata database ([#12952](https://github.com/netdata/netdata/pull/12952), [@stelfrag](https://github.com/stelfrag)) - Add detailed dbengine stats ([#12948](https://github.com/netdata/netdata/pull/12948), [@ktsaou](https://github.com/ktsaou)) - Stream Metric Correlation version to parent and advertise Metric Correlation status to the Cloud ([#12940](https://github.com/netdata/netdata/pull/12940), [@MrZammler](https://github.com/MrZammler)) - Move directories, logs, and environment variables configuration options to separate sections ([#12935](https://github.com/netdata/netdata/pull/12935), [@ilyam8](https://github.com/ilyam8)) - Adjust the dimension liveness status check ([#12933](https://github.com/netdata/netdata/pull/12933), [@stelfrag](https://github.com/stelfrag)) - Make sqlite PRAGMAs user configurable ([#12917](https://github.com/netdata/netdata/pull/12917), [@ktsaou](https://github.com/ktsaou)) - Add worker jobs for cgroup-rename, cgroup-network and cgroup-first-time 
([#12910](https://github.com/netdata/netdata/pull/12910), [@ktsaou](https://github.com/ktsaou)) - Return stable or nightly based on version if the file check fails ([#12894](https://github.com/netdata/netdata/pull/12894), [@stelfrag](https://github.com/stelfrag)) - Take into account the in queue wait time when executing a data query ([#12885](https://github.com/netdata/netdata/pull/12885), [@stelfrag](https://github.com/stelfrag)) - Add fixes and improvements to workers library ([#12863](https://github.com/netdata/netdata/pull/12863), [@ktsaou](https://github.com/ktsaou)) - Pause alert pushes to the cloud ([#12852](https://github.com/netdata/netdata/pull/12852), [@MrZammler](https://github.com/MrZammler)) - Allow to use the new MQTT 5 implementation ([#12838](https://github.com/netdata/netdata/pull/12838), [@underhood](https://github.com/underhood)) - Set a page wait timeout and retry count ([#12836](https://github.com/netdata/netdata/pull/12836), [@stelfrag](https://github.com/stelfrag)) - Allow external plugins to create chart labels ([#12834](https://github.com/netdata/netdata/pull/12834), [@ilyam8](https://github.com/ilyam8)) - Reduce the number of messages written in the error log due to out of bound timestamps ([#12829](https://github.com/netdata/netdata/pull/12829), [@stelfrag](https://github.com/stelfrag)) - Cleanup the node instance table on startup ([#12825](https://github.com/netdata/netdata/pull/12825), [@stelfrag](https://github.com/stelfrag)) - Accept a data query timeout parameter from the cloud ([#12823](https://github.com/netdata/netdata/pull/12823), [@stelfrag](https://github.com/stelfrag)) - Write the entire request with parameters in the access.log file ([#12815](https://github.com/netdata/netdata/pull/12815), [@stelfrag](https://github.com/stelfrag)) - Add a parameter for how many worker threads the libuv library needs to pre-initialize ([#12814](https://github.com/netdata/netdata/pull/12814), [@stelfrag](https://github.com/stelfrag)) - 
Optimize linking of foreach alarms to dimensions ([#12813](https://github.com/netdata/netdata/pull/12813), [@vkalintiris](https://github.com/vkalintiris)) - Add a hyphen to the list of available characters for chart names ([#12812](https://github.com/netdata/netdata/pull/12812), [@ilyam8](https://github.com/ilyam8)) - Speed up queries by providing optimization in the main loop ([#12811](https://github.com/netdata/netdata/pull/12811), [@ktsaou](https://github.com/ktsaou)) - Add workers utilization charts for Netdata components ([#12807](https://github.com/netdata/netdata/pull/12807), [@ktsaou](https://github.com/ktsaou)) - Fill missing removed events after a crash ([#12803](https://github.com/netdata/netdata/pull/12803), [@MrZammler](https://github.com/MrZammler)) - Speed up buffer increases (minimize reallocs) ([#12792](https://github.com/netdata/netdata/pull/12792), [@ktsaou](https://github.com/ktsaou)) - Speed up reading big proc files ([#12791](https://github.com/netdata/netdata/pull/12791), [@ktsaou](https://github.com/ktsaou)) - Make dbengine page cache undumpable and dedupable ([#12765](https://github.com/netdata/netdata/pull/12765), [@ilyam8](https://github.com/ilyam8)) - Speed up execution of external programs ([#12759](https://github.com/netdata/netdata/pull/12759), [@ktsaou](https://github.com/ktsaou)) - Remove per chart configuration ([#12728](https://github.com/netdata/netdata/pull/12728), [@vkalintiris](https://github.com/vkalintiris)) - Check for chart obsoletion on children re-connections ([#12707](https://github.com/netdata/netdata/pull/12707), [@MrZammler](https://github.com/MrZammler)) - Add a 2 minute timeout to stream receiver socket ([#12673](https://github.com/netdata/netdata/pull/12673), [@MrZammler](https://github.com/MrZammler)) - Improve Agent cloud chart synchronization ([#12655](https://github.com/netdata/netdata/pull/12655), [@stelfrag](https://github.com/stelfrag)) - Add the ability to perform a data query using an offline node id
([#12650](https://github.com/netdata/netdata/pull/12650), [@stelfrag](https://github.com/stelfrag)) - Implement ks_2samp test for Metric Correlations ([#12582](https://github.com/netdata/netdata/pull/12582), [@MrZammler](https://github.com/MrZammler)) - Reduce alert events sent to the cloud ([#12544](https://github.com/netdata/netdata/pull/12544), [@MrZammler](https://github.com/MrZammler)) - Store alert log entries even if the alert is repeating ([#12226](https://github.com/netdata/netdata/pull/12226), [@MrZammler](https://github.com/MrZammler)) - Improve storage number unpacking by using a lookup table ([#11048](https://github.com/netdata/netdata/pull/11048), [@vkalintiris](https://github.com/vkalintiris)) </details> #### Bug fixes <a id="v1350-notable-bug-fixes"></a> 🐞 Increasing Netdata's reliability one bug fix at a time. <details> <summary>Show 33 more contributions </summary> - Fix locking access to chart labels ([#13064](https://github.com/netdata/netdata/pull/13064), [@stelfrag](https://github.com/stelfrag)) - Fix coverity 378625 ([#13055](https://github.com/netdata/netdata/pull/13055), [@MrZammler](https://github.com/MrZammler)) - Fix dictionary crash walkthrough empty ([#13051](https://github.com/netdata/netdata/pull/13051), [@ktsaou](https://github.com/ktsaou)) - Fix the retry count and netdata_exit check when running a sqlite3_step command ([#13040](https://github.com/netdata/netdata/pull/13040), [@stelfrag](https://github.com/stelfrag)) - Fix sending first time seen dimensions with zero timestamp to the Cloud ([#13035](https://github.com/netdata/netdata/pull/13035), [@stelfrag](https://github.com/stelfrag)) - Fix gap filling on dbengine gaps ([#13027](https://github.com/netdata/netdata/pull/13027), [@ktsaou](https://github.com/ktsaou)) - Fix coverity issue 378598 ([#13022](https://github.com/netdata/netdata/pull/13022), [@MrZammler](https://github.com/MrZammler)) - Fix coverity issue 378617,378615
([#13021](https://github.com/netdata/netdata/pull/13021), [@stelfrag](https://github.com/stelfrag)) - Fix a dimension 100% anomaly rate despite no change in the metric value ([#13005](https://github.com/netdata/netdata/pull/13005), [@vkalintiris](https://github.com/vkalintiris)) - Fix compilation warnings ([#12993](https://github.com/netdata/netdata/pull/12993), [@vlvkobal](https://github.com/vlvkobal)) - Fix crash because of corrupted label message from streaming ([#12992](https://github.com/netdata/netdata/pull/12992), [@MrZammler](https://github.com/MrZammler)) - Fix nanosleep on platforms other than Linux ([#12991](https://github.com/netdata/netdata/pull/12991), [@vlvkobal](https://github.com/vlvkobal)) - Fix disabling a streaming destination because of denied access ([#12971](https://github.com/netdata/netdata/pull/12971), [@MrZammler](https://github.com/MrZammler)) - Fix "unused variable" compilation warning ([#12969](https://github.com/netdata/netdata/pull/12969), [@kklionz](https://github.com/kklionz)) - Fix virtualization detection on FreeBSD ([#12964](https://github.com/netdata/netdata/pull/12964), [@ilyam8](https://github.com/ilyam8)) - Fix buffer overflow when logging "command_to_be_logged" in analytics ([#12947](https://github.com/netdata/netdata/pull/12947), [@MrZammler](https://github.com/MrZammler)) - Fix "global statistics" section in netdata.conf ([#12916](https://github.com/netdata/netdata/pull/12916), [@ilyam8](https://github.com/ilyam8)) - Fix virtualization detection when systemd-detect-virt is not available ([#12911](https://github.com/netdata/netdata/pull/12911), [@ilyam8](https://github.com/ilyam8)) - Fix the log entry for incoming cloud start streaming commands ([#12908](https://github.com/netdata/netdata/pull/12908), [@stelfrag](https://github.com/stelfrag)) - Fix release channel in the node info message ([#12905](https://github.com/netdata/netdata/pull/12905), [@stelfrag](https://github.com/stelfrag)) - Fix alarms count in 
/api/v1/alarm_count ([#12896](https://github.com/netdata/netdata/pull/12896), [@MrZammler](https://github.com/MrZammler)) - Fix compilation warnings in FreeBSD ([#12887](https://github.com/netdata/netdata/pull/12887), [@vlvkobal](https://github.com/vlvkobal)) - Fix multihost queries alignment ([#12870](https://github.com/netdata/netdata/pull/12870), [@stelfrag](https://github.com/stelfrag)) - Fix negative worker jobs busy time ([#12867](https://github.com/netdata/netdata/pull/12867), [@ktsaou](https://github.com/ktsaou)) - Fix reported by coverity issues related to memory and structure dereference ([#12846](https://github.com/netdata/netdata/pull/12846), [@stelfrag](https://github.com/stelfrag)) - Fix memory leaks and mismatches of the use of the z functions for allocations ([#12841](https://github.com/netdata/netdata/pull/12841), [@ktsaou](https://github.com/ktsaou)) - Fix using obsolete charts/dims in prediction thread ([#12833](https://github.com/netdata/netdata/pull/12833), [@vkalintiris](https://github.com/vkalintiris)) - Fix not skipping ACLK dimension update when dimension is freed ([#12777](https://github.com/netdata/netdata/pull/12777), [@stelfrag](https://github.com/stelfrag)) - Fix coverity warning about not checking return value in receiver setsockopt ([#12772](https://github.com/netdata/netdata/pull/12772), [@MrZammler](https://github.com/MrZammler)) - Fix disk size calculation on macOS ([#12764](https://github.com/netdata/netdata/pull/12764), [@ilyam8](https://github.com/ilyam8)) - Fix "implicit declaration of function" compilation warning ([#12756](https://github.com/netdata/netdata/pull/12756), [@ilyam8](https://github.com/ilyam8)) - Fix Valgrind errors ([#12619](https://github.com/netdata/netdata/pull/12619), [@vlvkobal](https://github.com/vlvkobal)) - Fix redirecting alert emails for a child to the parent ([#12609](https://github.com/netdata/netdata/pull/12609), [@MrZammler](https://github.com/MrZammler)) </details> #### Code organization <a 
id="v1350-notable-code-organization"></a> 🏋️ Changes to keep our code base in good shape. <details> <summary>Show 48 more contributions </summary> - Update default value for "host anomaly rate threshold" ([#13075](https://github.com/netdata/netdata/pull/13075), [@shyamvalsan](https://github.com/shyamvalsan)) - Initialize chart label key parameter correctly ([#13061](https://github.com/netdata/netdata/pull/13061), [@stelfrag](https://github.com/stelfrag)) - Add the ability to merge dictionary items ([#13054](https://github.com/netdata/netdata/pull/13054), [@ktsaou](https://github.com/ktsaou)) - Dictionary improvements ([#13052](https://github.com/netdata/netdata/pull/13052), [@ktsaou](https://github.com/ktsaou)) - Coverity fixes about statsd; removal of strsame ([#13049](https://github.com/netdata/netdata/pull/13049), [@ktsaou](https://github.com/ktsaou)) - Replace `history` with relevant `dbengine` params ([#13041](https://github.com/netdata/netdata/pull/13041), [@andrewm4894](https://github.com/andrewm4894)) - Schedule retention message calculation to a worker thread ([#13039](https://github.com/netdata/netdata/pull/13039), [@stelfrag](https://github.com/stelfrag)) - Check return value and log an error on failure ([#13037](https://github.com/netdata/netdata/pull/13037), [@stelfrag](https://github.com/stelfrag)) - Add additional metadata to the data response ([#13036](https://github.com/netdata/netdata/pull/13036), [@stelfrag](https://github.com/stelfrag)) - Dictionary with JudyHS and double linked list ([#13032](https://github.com/netdata/netdata/pull/13032), [@ktsaou](https://github.com/ktsaou)) - Initialize a pointer and add a check for it ([#13023](https://github.com/netdata/netdata/pull/13023), [@vlvkobal](https://github.com/vlvkobal)) - Autodetect coverity install path to increase robustness ([#12995](https://github.com/netdata/netdata/pull/12995), [@maneamarius](https://github.com/maneamarius)) - Don't expose the chart definition to streaming if there is no 
metadata change ([#12990](https://github.com/netdata/netdata/pull/12990), [@stelfrag](https://github.com/stelfrag)) - Make heartbeat a static chart ([#12986](https://github.com/netdata/netdata/pull/12986), [@MrZammler](https://github.com/MrZammler)) - Return rc->last_update from alarms_values api ([#12968](https://github.com/netdata/netdata/pull/12968), [@MrZammler](https://github.com/MrZammler)) - Suppress warning when freeing a NULL pointer in onewayalloc_freez ([#12955](https://github.com/netdata/netdata/pull/12955), [@stelfrag](https://github.com/stelfrag)) - Trigger queue removed alerts on health log exchange with cloud ([#12954](https://github.com/netdata/netdata/pull/12954), [@MrZammler](https://github.com/MrZammler)) - Defer the dimension payload check to the ACLK sync thread ([#12951](https://github.com/netdata/netdata/pull/12951), [@stelfrag](https://github.com/stelfrag)) - Reduce timeout to 1 second for getting cloud instance info ([#12941](https://github.com/netdata/netdata/pull/12941), [@MrZammler](https://github.com/MrZammler)) - Add links to SQLite init options in the src code ([#12920](https://github.com/netdata/netdata/pull/12920), [@ilyam8](https://github.com/ilyam8)) - Remove "enable new cgroups detected at run time" config option ([#12906](https://github.com/netdata/netdata/pull/12906), [@ilyam8](https://github.com/ilyam8)) - Log an error when re-registering an already registered job ([#12903](https://github.com/netdata/netdata/pull/12903), [@ilyam8](https://github.com/ilyam8)) - Use correct identifier when registering the main thread "chart" worker job ([#12902](https://github.com/netdata/netdata/pull/12902), [@ilyam8](https://github.com/ilyam8)) - Change duplicate health template message logging level to 'info' ([#12873](https://github.com/netdata/netdata/pull/12873), [@ilyam8](https://github.com/ilyam8)) - Initialize the metadata database when performing dbengine stress test ([#12861](https://github.com/netdata/netdata/pull/12861), 
[@stelfrag](https://github.com/stelfrag)) - Add a SQLite database checkpoint command ([#12859](https://github.com/netdata/netdata/pull/12859), [@stelfrag](https://github.com/stelfrag)) - Broadcast completion before unlocking condition variable's mutex ([#12822](https://github.com/netdata/netdata/pull/12822), [@vkalintiris](https://github.com/vkalintiris)) - Switch to mallocz() in onewayallocator ([#12810](https://github.com/netdata/netdata/pull/12810), [@ktsaou](https://github.com/ktsaou)) - Configurable storage engine for Netdata Agents: step 2 ([#12808](https://github.com/netdata/netdata/pull/12808), [@aberaud](https://github.com/aberaud)) - Move kickstart argument parsing code to a function. ([#12805](https://github.com/netdata/netdata/pull/12805), [@Ferroin](https://github.com/Ferroin)) - Remove python.d/* announced in v1.34.0 deprecation notice ([#12796](https://github.com/netdata/netdata/pull/12796), [@ilyam8](https://github.com/ilyam8)) - Don't use MADV_DONTDUMP on non-linux builds ([#12795](https://github.com/netdata/netdata/pull/12795), [@vkalintiris](https://github.com/vkalintiris)) - One way allocator to double the speed of parallel context queries ([#12787](https://github.com/netdata/netdata/pull/12787), [@ktsaou](https://github.com/ktsaou)) - Trace rwlocks of netdata ([#12785](https://github.com/netdata/netdata/pull/12785), [@ktsaou](https://github.com/ktsaou)) - Configurable storage engine for Netdata Agents: step 1 ([#12776](https://github.com/netdata/netdata/pull/12776), [@aberaud](https://github.com/aberaud)) - Some config updates for ML ([#12771](https://github.com/netdata/netdata/pull/12771), [@andrewm4894](https://github.com/andrewm4894)) - Remove node.d.plugin and relevant files ([#12769](https://github.com/netdata/netdata/pull/12769), [@surajnpn](https://github.com/surajnpn)) - Use aclk_parse_otp_error on /env error ([#12767](https://github.com/netdata/netdata/pull/12767), [@underhood](https://github.com/underhood)) - Remove "search for 
cgroups under PATH" conf option to fix memory leak ([#12752](https://github.com/netdata/netdata/pull/12752), [@ilyam8](https://github.com/ilyam8)) - Remove "enable cgroup X" config option on cgroup deletion ([#12746](https://github.com/netdata/netdata/pull/12746), [@ilyam8](https://github.com/ilyam8)) - Remove undocumented feature reading cgroups-names.sh when renaming cgroups ([#12745](https://github.com/netdata/netdata/pull/12745), [@ilyam8](https://github.com/ilyam8)) - Reduce logging in rrdset ([#12739](https://github.com/netdata/netdata/pull/12739), [@ilyam8](https://github.com/ilyam8)) - Avoid clearing already unset flags. ([#12727](https://github.com/netdata/netdata/pull/12727), [@vkalintiris](https://github.com/vkalintiris)) - Remove commented code ([#12726](https://github.com/netdata/netdata/pull/12726), [@vkalintiris](https://github.com/vkalintiris)) - Remove unused `--auto-update` option when using static/build install method ([#12725](https://github.com/netdata/netdata/pull/12725), [@ilyam8](https://github.com/ilyam8)) - Allocate buffer memory for uv_write and release in the callback function ([#12688](https://github.com/netdata/netdata/pull/12688), [@stelfrag](https://github.com/stelfrag)) - Implements new capability fields in aclk_schemas ([#12602](https://github.com/netdata/netdata/pull/12602), [@underhood](https://github.com/underhood)) - Cleanup Challenge Response Code ([#11730](https://github.com/netdata/netdata/pull/11730), [@underhood](https://github.com/underhood)) </details> ## Deprecation notice <a id="v1350-deprecation-notice"></a> The following items will be removed in our next minor release (v1.36.0): > Patch releases (if any) will not be affected. 
| Component | Type | Will be replaced by | |------------------------------------------------------------------------------------------------------------------------|:---------:|:--------------------------------------------------------------------------------------------------------:| | [python.d/chrony](https://github.com/netdata/netdata/tree/v1.34.1/collectors/python.d.plugin/chrony) | collector | [go.d/chrony](https://github.com/netdata/go.d.plugin/tree/master/modules/chrony) | | [python.d/ovpn_status_log](https://github.com/netdata/netdata/tree/v1.34.1/collectors/python.d.plugin/ovpn_status_log) | collector | [go.d/openvpn_status_log](https://github.com/netdata/go.d.plugin/tree/master/modules/openvpn_status_log) | All the deprecated components will be moved to the [netdata/community](https://github.com/netdata/community) repository. ### Deprecated in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/1.34.0#deprecation-notice), the following items have been removed in this release: | Component | Type | Replaced by | |----------------------------------------------------------------------------------------------------------------------|:---------:|:---------------------------------------------------------------------------------------------------------:| | [node.d](https://github.com/netdata/netdata/tree/v1.33.1/collectors/node.d.plugin#nodedplugin) | plugin | - | | [node.d/snmp](https://github.com/netdata/netdata/tree/v1.33.1/collectors/node.d.plugin/snmp) | collector | [go.d/snmp](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/snmp) | | [python.d/apache](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/apache) | collector | [go.d/apache](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache) | | [python.d/couchdb](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/couchdb) | collector | 
[go.d/couchdb](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/couchdb) | | [python.d/dns_query_time](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/dns_query_time) | collector | [go.d/dnsquery](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/dnsquery) | | [python.d/dnsdist](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/dnsdist) | collector | [go.d/dnsdist](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/dnsdist) | | [python.d/elasticsearch](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/elasticsearch) | collector | [go.d/elasticsearch](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/elasticsearch) | | [python.d/energid](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/energid) | collector | [go.d/energid](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/energid) | | [python.d/freeradius](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/freeradius) | collector | [go.d/freeradius](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/freeradius) | | [python.d/httpcheck](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/httpcheck) | collector | [go.d/httpcheck](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/httpcheck) | | [python.d/isc_dhcpd](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/isc_dhcpd) | collector | [go.d/isc_dhcpd](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/isc_dhcpd) | | [python.d/mysql](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/mysql) | collector | [go.d/mysql](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql) | | [python.d/nginx](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/nginx) | collector | 
[go.d/nginx](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) | | [python.d/phpfpm](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/phpfpm) | collector | [go.d/phpfpm](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/phpfpm) | | [python.d/portcheck](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/portcheck) | collector | [go.d/portcheck](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/portcheck) | | [python.d/powerdns](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/powerdns) | collector | [go.d/powerdns](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/powerdns) | | [python.d/redis](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/redis) | collector | [go.d/redis](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis) | | [python.d/web_log](https://github.com/netdata/netdata/tree/v1.33.1/collectors/python.d.plugin/web_log) | collector | [go.d/weblog](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog) | ### Platform Support Changes <a id="v1350-platform-support"></a> This release adds official support for the following platforms: - RHEL 9.x, Alma Linux 9.x, and other compatible RHEL 9.x derived platforms - Alpine Linux 3.16 This release removes official support for the following platforms: - Fedora 34 (support ended due to upstream EOL). - Alpine Linux 3.12 (support ended due to upstream EOL). This release includes the following additional platform support changes. - We’ve switched from Alpine 3.15 to Alpine 3.16 as the base for our Docker images and static builds. This should not require any action on the part of users, and simply represents a version bump to the tooling included in our Docker images and static builds. 
- We’ve switched from Rocky Linux to Alma Linux as our build and test platform for RHEL compatible systems. This will enable us to provide better long-term support for such platforms, as well as opening the possibility of better support for non-x86 systems. ## Netdata Agent Release Meetup <a id="v1350-release-meetup"></a> Join the Netdata team on the **9th of June at 5pm UTC** for the **Netdata Agent Release Meetup**, which will be held on the [Netdata Discord](https://discord.gg/pnpjpwfE?event=983676714062315560). Together we’ll cover: - Release Highlights - Acknowledgements - Q&A with the community [RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/286414883/) - we look forward to meeting you. ## Support options <a id="v1350-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1100 engineers are already using it!
2022-06-08T18:51:44+00:00 netdata v1.35.1 netdata v1.35.1 2022-06-10T14:15:47+00:00 Netdata v1.35.1 is a patch release to address issues discovered since v1.35.0. [Refer to the v1.35.0 release notes](https://github.com/netdata/netdata/releases/tag/v1.35.0) for the full scope of that release. The v1.35.1 patch release fixes an issue in the static build installation code that causes automatic updates to be unintentionally disabled when updating static installs. If you have installed Netdata using a static build since 2022-03-22 and you did not explicitly disable automatic updates, you are probably affected by this bug. For more details, including info on how to re-enable automatic updates if you are affected, refer to [this GitHub issue](https://github.com/netdata/netdata/issues/13102). ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud/): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1100 engineers are already using it!
2022-06-10T14:15:47+00:00 netdata v1.36.0 netdata v1.36.0 2022-08-10T20:04:46+00:00 # Release v1.36 **Table of contents** - [Release highlights](#v1360-release-highlights) - [Metric correlations](#v1360-mc) - [New metric correlation algorithm (tech preview)](#v1360-mc-new-algo) - [Cooperation of the Metric Correlations (MC) component with the Anomaly Advisor](#v1360-mc-aa-coop) - [Metric correlations dashboard](#v1360-mc-new-dash) - [What's next with Metric Correlations](#v1360-mc-whats-next) - [Tiering, providing almost unlimited metrics for your nodes](#v1360-tiering) - [Kubernetes](#v1360-kubernetes) - [Anomaly Rate on every chart](#v1360-anomaly-rate-on-charts) - [Centralized Admin Interface & Bulk deletion of offline nodes](#v1360-cai) - [Agent and Cloud chart metadata syncing](#v1360-chart-metadata-syncing) - [Visualization improvements](#v1360-visualization-improvements) - [Labels on every chart](#v1360-labels-on-charts) - [Acknowledgments](#v1360-ack) - [Contributions](#v1360-contributions) - [Deprecation notice](#v1360-deprecation-notice) - [Netdata Release Meetup](#v1360-release-meetup) - [Support options](#v1360-support-options) > ❗ We're keeping our codebase healthy by removing features that are end of life. Read the [deprecation notice](#v1360-deprecation-notice) to check if you are affected. ### Netdata open-source growth - 7.6M+ troubleshooters monitor with Netdata - 1.6M unique nodes currently live - 3.3k+ new nodes per day - Over 557M Docker pulls all-time total - [Over 60,000 stargazers on GitHub](https://twitter.com/linuxnetdata/status/1555555853993971715) ## Release highlights <a id="v1360-release-highlights"></a> ### Metric correlations <a id="v1360-mc"></a> #### New metric correlation algorithm (tech preview) <a id="v1360-mc-new-algo"></a> The Agent's default algorithm for running a metric correlations job (_ks2_) is based on the Kolmogorov-Smirnov test.
In this release, we also included the _Volume_ algorithm, which is a heuristic algorithm based on the percentage change in averages between the highlighted window and a baseline, where various edge cases are sensibly controlled. You can explore our implementation in the [Agent's source code](https://github.com/netdata/netdata/blob/d917f9831c0a1638ef4a56580f321eb6c9a88037/database/metric_correlations.c#L516-L620). This algorithm is almost **73 times faster** than the default algorithm (named _ks2_) with nearly the same accuracy. Try it by making it the default in your netdata.conf:
```
[global]
   # enable metric correlations = yes
   metric correlations method = volume
```
#### Cooperation of the Metric Correlations (MC) component with the Anomaly Advisor <a id="v1360-mc-aa-coop"></a> The Anomaly Advisor feature lets you quickly surface potentially anomalous metrics and charts related to a particular highlight window of interest. When the Agent trains its internal Machine Learning models, it produces an Anomaly Rate for each metric. With this release, Netdata can now perform Metric Correlation jobs based on these Anomaly Rate values for your metrics. #### Metric correlations dashboard <a id="v1360-mc-new-dash"></a> In the past, you ran MC jobs from the Node's dashboard with all the settings predefined. Now, Netdata lets you run an MC job for a window of interest with the following options: 1. Run an MC job on both Metrics and their Anomaly Rate. 2. Change the aggregation method of datapoints for the metrics. 3. Choose between different algorithms. All this from the same, single dashboard.
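To make the _Volume_ idea from the metric correlations section more concrete, here is a minimal Python sketch of a score based on the percentage change in averages between a baseline and a highlighted window. It is an illustration only, not the Agent's C implementation: the function names, the sample data, and the handling of empty windows and a zero baseline are all assumptions made for this example.

```python
from statistics import fmean


def volume_score(baseline, highlight):
    """Relative change of the highlighted window's average versus the baseline's."""
    if not baseline or not highlight:
        return 0.0  # nothing to compare (assumed edge-case rule)
    base_avg = fmean(baseline)
    high_avg = fmean(highlight)
    if base_avg == 0:
        # Assumed edge-case rule: no change if both averages are zero,
        # otherwise treat the move away from zero as a full (100%) change.
        return 0.0 if high_avg == 0 else 1.0
    return abs(high_avg - base_avg) / abs(base_avg)


def rank_metrics(windows):
    """Rank metric names by score, largest change first."""
    scores = {name: volume_score(b, h) for name, (b, h) in windows.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical metrics: (baseline samples, highlighted-window samples).
windows = {
    "system.cpu": ([10, 12, 11, 9], [30, 34, 32, 28]),
    "system.ram": ([50, 50, 51, 49], [50, 51, 50, 49]),
}
ranking = rank_metrics(windows)
```

In this sketch the metric whose average moved the most relative to its baseline ranks first, which is the intuition behind using the change in averages as a cheap alternative to a full distribution test.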
![Image](https://user-images.githubusercontent.com/88642300/183893817-59a67fa5-652c-47e4-ad70-5ac015e563b7.gif) #### What's next with Metric Correlations <a id="v1360-mc-whats-next"></a> Troubleshooting complicated infrastructures can get increasingly hard, but Netdata wants to continually provide you with the best troubleshooting experience. On that note, here are some next logical steps for our Metric Correlations feature, planned for upcoming releases: 1. Enriching the Agent with more Metric Correlation algorithms. 2. Making the Metric Correlation component run seamlessly (you can explore the `/weights` endpoint in the [Agent's API](https://editor.swagger.io/?url=https://raw.githubusercontent.com/netdata/netdata/master/web/api/netdata-swagger.yaml); this is a WIP). 3. Giving you the ability to run Metric Correlation Jobs across multiple nodes. Be on the lookout for these upgrades and feel free to reach out to us in our channels with your ideas. ### Tiering, providing almost unlimited metrics for your nodes <a id="v1360-tiering"></a> Netdata is a high fidelity monitoring solution. That comes with a cost: keeping that data on your disks. To help remedy this cost issue, this release introduces the _Tiering_ mechanism for the Agent's time-series database ([dbengine](https://learn.netdata.cloud/docs/agent/database/engine)). Tiering provides multiple tiers of data with different granularity by doing the following: 1. Downsampling the data into lower resolution data. 2. Keeping statistical information about the metrics to recreate the original* metrics. Visit the [Tiering in a nutshell](https://learn.netdata.cloud/docs/agent/database/engine#tiering-in-a-nutshell) section in our docs to understand the maximum potential of this feature.
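The two tiering steps above (downsampling, plus keeping statistical information) can be sketched in a few lines of Python. This is a simplified illustration, not the dbengine's storage format: the `TierPoint` class, the `downsample` function, and the assumption that each lower-resolution point keeps a minimum, maximum, sum, and count are choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class TierPoint:
    """One lower-resolution point summarizing several higher-resolution samples."""
    minimum: float
    maximum: float
    total: float
    count: int

    @property
    def average(self):
        return self.total / self.count


def downsample(samples, factor):
    """Summarize every `factor` consecutive samples into one TierPoint."""
    points = []
    for start in range(0, len(samples), factor):
        chunk = samples[start:start + factor]
        points.append(TierPoint(min(chunk), max(chunk), sum(chunk), len(chunk)))
    return points


# Hypothetical per-second samples reduced to one point per three seconds.
per_second = [1.0, 3.0, 2.0, 6.0, 4.0, 8.0]
tier1 = downsample(per_second, 3)
```

With summaries like these, a query on the lower tier can still report the minimum, maximum, and average of a period, even though the individual samples are no longer stored.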
Also, don't hesitate to enable this feature to [change the retention of your metrics](https://learn.netdata.cloud/docs/store/change-metrics-storage). Note: *Of course, the metric may vary; you can't just recreate the exact time series without taking other parameters into consideration. ### Kubernetes <a id="v1360-kubernetes"></a> A Kubernetes Cluster can easily have hundreds (or even thousands) of pods running containers. Netdata is now able to provide you with an overview of the workloads and the nodes of your Cluster. Explore the full capabilities of the [k8s_state module](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_state). ### Anomaly Rate on every chart <a id="v1360-anomaly-rate-on-charts"></a> In a previous release, we introduced unsupervised ML & Anomaly Detection in Netdata with [Anomaly Advisor](https://www.netdata.cloud/blog/introducing-anomaly-advisor-unsupervised-anomaly-detection-in-netdata/). With this next step, we’re bringing anomaly rates to every chart in Netdata Cloud. Anomaly information is no longer limited to the Anomalies tab and will be accessible to you from the Overview and Single Node view tabs as well. We hope this will make your troubleshooting journey easier, as you will have the anomaly rates for any metric available with a single click, whichever metric or chart you happen to be exploring at that instant. If you are looking at a particular metric in the overview or single node dashboard and are wondering if the metric was truly anomalous or not, you can now confirm or disprove that feeling by clicking on the anomaly icon and expanding the anomaly rate view. Anomaly rates are calculated per second based on ML models that are trained every hour. ![Metrics Dashboard Anomaly](https://user-images.githubusercontent.com/82235632/182436141-d006e940-7e96-40b1-8cd6-176175d3c465.gif) For more details please check our [blog post and video walkthrough](https://www.netdata.cloud/blog/anomaly-rate-in-every-chart).
### Centralized Admin Interface & Bulk deletion of offline nodes <a id="v1360-cai"></a> We've listened and understood your pain around Space and War Room settings in Netdata Cloud. In response, we have simplified and organized these settings into a Centralized Administration Interface! In a single place, you're now able to access and change attributes around: - Space - War Rooms - Nodes - Users - Notifications - Bookmarks ![CAI_full](https://user-images.githubusercontent.com/88642300/183901711-69a44acb-4b35-4203-9aee-571c232497a4.gif) Along with this change, the deletion of individual offline nodes has been greatly improved. You can now open the Space settings, go to Nodes, filter for all Offline nodes, and then mass select and bulk delete them. ### Agent and Cloud chart metadata syncing <a id="v1360-chart-metadata-syncing"></a> In this release, we made a major improvement to our chart metadata syncing protocol. We moved from a very granular message exchange at [chart dimension](https://learn.netdata.cloud/docs/dashboard/dimensions-contexts-families#dimension) level to a higher level at [context](https://learn.netdata.cloud/docs/dashboard/dimensions-contexts-families#context). This approach allows us to decrease the complexity and points of failure in this flow, since we reduced the number of events being exchanged and the scenarios that need to be dealt with. We will continuously fix complex and hard-to-track existing bugs and any potential unknown ones. This also benefits data transfer from Agents to Cloud, since we reduced the number of messages being transmitted. To sum up these changes: 1. The traffic between Netdata Cloud and Agents is reduced significantly. 2. Netdata Cloud scales smoother with hundreds of nodes. 3. Netdata Cloud is aware of charts and nodes metadata. 
### Visualization improvements <a id="v1360-visualization-improvements"></a> #### Composite chart enhancements We have restructured composite charts into a more natural presentation. You can now read composite charts as if reading a simple sentence, and make better sense of how and what queries are being triggered. In addition to this, we've added more control over time aggregations. You can now instruct the agent nodes on what type of aggregation you want to apply when multiple points are grouped into a single one. The options available are: min, max, average, sum, incremental sum (delta), standard deviation, coefficient of variation, median, exponential weighted moving average and double exponential smoothing. ![s8ViedR](https://user-images.githubusercontent.com/82235632/182433697-31717d23-588a-481e-96a5-4c6220fe04e7.gif) #### Theme restyling We've also put some effort into improving our light and dark themes. The focus was put on: * optimizing space for the information that is crucial to you when you're exploring and/or troubleshooting your nodes. * improving contrast ratios so that the components and data that are more relevant don't get lost among other _noise_. ![image](https://user-images.githubusercontent.com/82235632/182435316-f901b548-5c99-483c-9c94-1bed1ffe2105.png) ### Labels on every chart <a id="v1360-labels-on-charts"></a> Most of the time, you will group metrics by their dimension or their instance, but there are some benefits to other groupings. So, you can now group them by logical representations. For instance, you can represent the traffic in your network interfaces by their interface type, virtual or physical. 
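As a rough illustration of grouping by a label instead of by instance, the Python sketch below sums traffic per interface type. The data layout, the `group_by_label` helper, and the `interface_type` label name are all assumptions chosen for the example; they are not the Agent's or Cloud's actual data model.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(instances: List[dict], label: str) -> Dict[str, float]:
    """Sum each instance's value into the bucket named by its label value."""
    totals: Dict[str, float] = defaultdict(float)
    for instance in instances:
        # Instances without the label are collected under "unknown".
        key = instance["labels"].get(label, "unknown")
        totals[key] += instance["value"]
    return dict(totals)


# Hypothetical per-interface traffic samples with a type label.
interfaces = [
    {"name": "eth0", "labels": {"interface_type": "physical"}, "value": 120.0},
    {"name": "eth1", "labels": {"interface_type": "physical"}, "value": 30.0},
    {"name": "veth12", "labels": {"interface_type": "virtual"}, "value": 5.5},
    {"name": "docker0", "labels": {"interface_type": "virtual"}, "value": 4.5},
]
by_type = group_by_label(interfaces, "interface_type")
```

Here four instances collapse into two series, one per label value, which is the kind of logical view the grouping option is meant to provide.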
![Group By Options](https://user-images.githubusercontent.com/88642300/183892566-10a93d6a-84e9-4829-8dec-2ac7fe91b029.gif) This is still a work in progress, but you can explore the newly added labels on the following areas/charts: - Disks - Mountpoints in your system - Network interfaces both wired and wireless - MD arrays - Power supply units - Filesystem (like BTRFS) ## Acknowledgments <a id="v1360-ack"></a> We would like to thank our dedicated, talented contributors who make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - [@didier13150](https://github.com/didier13150) for fixing boolean value for ProtectControlGroups in the systemd unit file. - [@kklionz](https://github.com/kklionz) for fixing a base64_encode bug in Exporting Engine. - [@kralewitz](https://github.com/kralewitz) for fixing the parsing of multiple values in nginx upstream_response_time in go.d/web_log. - [@mhkarimi1383](https://github.com/mhkarimi1383) for adding an alternative way to get ansible plays to Ansible quickstart. - [@tnyeanderson](https://github.com/tnyeanderson) for fixing netdata-updater.sh sha256sum on BSDs. - [@xkisu](https://github.com/xkisu) for fixing cgroup name detection for docker containers in containerd cgroup. - [@boxjan](https://github.com/boxjan) for adding Chrony collector. ## Contributions <a id="v1360-contributions"></a> ### Collectors #### New ⚙️ Enhancing our collectors to collect all the data you need. 
- Add PgBouncer collector (go.d/pgbouncer) ([#748](https://github.com/netdata/go.d.plugin/pull/748), [@ilyam8](https://github.com/ilyam8)) - Add WireGuard collector (go.d/wireguard) ([#744](https://github.com/netdata/go.d.plugin/pull/744), [@ilyam8](https://github.com/ilyam8)) - Add PostgreSQL collector (go.d/postgres) ([#718](https://github.com/netdata/go.d.plugin/pull/718), [@ilyam8](https://github.com/ilyam8)) - Add Chrony collector (go.d/chrony) ([#678](https://github.com/netdata/go.d.plugin/pull/678), [@boxjan](https://github.com/boxjan)) - Add Kubernetes State collector (go.d/k8s_state) ([#673](https://github.com/netdata/go.d.plugin/pull/673), [@ilyam8](https://github.com/ilyam8)) #### Improvements ⚙️ Enhancing our collectors to collect all the data you need. <details> <summary> Show 20 more contributions </summary> - Add WireGuard description and icon to dashboard info ([#13483](https://github.com/netdata/netdata/pull/13483), [@ilyam8](https://github.com/ilyam8)) - Resolve nomad containers name (cgroups.plugin) ([#13481](https://github.com/netdata/netdata/pull/13481), [@ilyam8](https://github.com/ilyam8)) - Update postgres dashboard info ([#13474](https://github.com/netdata/netdata/pull/13474), [@ilyam8](https://github.com/ilyam8)) - Improve Chrony dashboard info ([#13371](https://github.com/netdata/netdata/pull/13371), [@ilyam8](https://github.com/ilyam8)) - Improve config file parsing error message (python.d) ([#13363](https://github.com/netdata/netdata/pull/13363), [@ilyam8](https://github.com/ilyam8)) - Rename the chart of real memory usage in FreeBSD (freebsd.plugin) ([#13271](https://github.com/netdata/netdata/pull/13271), [@vlvkobal](https://github.com/vlvkobal)) - Add fstype label to disk charts (diskspace.plugin) ([#13245](https://github.com/netdata/netdata/pull/13245), [@vlvkobal](https://github.com/vlvkobal)) - Add support for loading modules from user plugin directories (python.d) ([#13214](https://github.com/netdata/netdata/pull/13214), 
[@ilyam8](https://github.com/ilyam8)) - Add user plugin dirs to environment variables ([#13203](https://github.com/netdata/netdata/pull/13203), [@vlvkobal](https://github.com/vlvkobal)) - Add second data collection job that tries to read from '/var/lib/smartmontools/' (python.d/smartd) ([#13188](https://github.com/netdata/netdata/pull/13188), [@ilyam8](https://github.com/ilyam8)) - Add type label for network interfaces (proc.plugin) ([#13187](https://github.com/netdata/netdata/pull/13187), [@vlvkobal](https://github.com/vlvkobal)) - Add k8s_state dashboard_info ([#13181](https://github.com/netdata/netdata/pull/13181), [@ilyam8](https://github.com/ilyam8)) - Add dimension per physical link state to the "Interface Physical Link State" chart (proc.plugin) ([#13176](https://github.com/netdata/netdata/pull/13176), [@ilyam8](https://github.com/ilyam8)) - Add dimension per operational state to the "Interface Operational State" chart (proc.plugin) ([#13167](https://github.com/netdata/netdata/pull/13167), [@ilyam8](https://github.com/ilyam8)) - Add dimension per duplex state to the "Interface Duplex State" chart (proc.plugin) ([#13165](https://github.com/netdata/netdata/pull/13165), [@ilyam8](https://github.com/ilyam8)) - Add cargo/rustc/bazel/buck to apps_groups.conf (apps.plugin) ([#13143](https://github.com/netdata/netdata/pull/13143), [@vkalintiris](https://github.com/vkalintiris)) - Add Memory Available chart to FreeBSD (freebsd.plugin) ([#13140](https://github.com/netdata/netdata/pull/13140), [@MrZammler](https://github.com/MrZammler)) - Add a separate thread for slow mountpoints in the diskspace plugin (diskspace.plugin) ([#13067](https://github.com/netdata/netdata/pull/13067), [@vlvkobal](https://github.com/vlvkobal)) - Add simple dimension algorithm guess logic when algorithm is not set (go.d/snmp) ([#737](https://github.com/netdata/go.d.plugin/pull/737), [@ilyam8](https://github.com/ilyam8)) - Add common stub_status locations (go.d/nginx) 
([#702](https://github.com/netdata/go.d.plugin/pull/702), [@cpipilas](https://github.com/cpipilas)) </details> #### Bug fixes 🐞 Improving our collectors one bug fix at a time. <details> <summary> Show 17 more contributions </summary> - Fix cgroup name detection for docker containers in containerd cgroup (cgroups.plugin) ([#13470](https://github.com/netdata/netdata/pull/13470), [@xkisu](https://github.com/xkisu)) - Fix not handling log rotation (python.d/smartd) ([#13460](https://github.com/netdata/netdata/pull/13460), [@ilyam8](https://github.com/ilyam8)) - Fix kubepods patterns to filter pods when using Kind cluster (cgroups.plugin) ([#13324](https://github.com/netdata/netdata/pull/13324), [@ilyam8](https://github.com/ilyam8)) - Fix 'zmstat*' pattern to exclude zoneminder scripts (apps.plugin) ([#13314](https://github.com/netdata/netdata/pull/13314), [@ilyam8](https://github.com/ilyam8)) - Fix kubepods name resolution in a kind cluster (cgroups.plugin) ([#13302](https://github.com/netdata/netdata/pull/13302), [@ilyam8](https://github.com/ilyam8)) - Fix extensive error logging (cgroups.plugin) ([#13274](https://github.com/netdata/netdata/pull/13274), [@vlvkobal](https://github.com/vlvkobal)) - Fix qemu VMs and LXC containers name resolution (cgroups.plugin) ([#13220](https://github.com/netdata/netdata/pull/13220), [@ilyam8](https://github.com/ilyam8)) - Fix duplicate mountinfo (proc.plugin) ([#13215](https://github.com/netdata/netdata/pull/13215), [@ktsaou](https://github.com/ktsaou)) - Fix removing netdev chart labels (cgroups.plugin) ([#13200](https://github.com/netdata/netdata/pull/13200), [@vlvkobal](https://github.com/vlvkobal)) - Fix wired/cached/avail memory calculation on FreeBSD with ZFS (freebsd.plugin) ([#13183](https://github.com/netdata/netdata/pull/13183), [@ilyam8](https://github.com/ilyam8)) - Fix import collection for py3.10+ (python.d) ([#13136](https://github.com/netdata/netdata/pull/13136), [@ilyam8](https://github.com/ilyam8)) - Fix not setting 
connection timeout for pymongo4+ (python.d/mongodb) ([#13135](https://github.com/netdata/netdata/pull/13135), [@ilyam8](https://github.com/ilyam8)) - Fix not handling slow setting spec.NodeName for Pods (go.d/k8s_state) ([#717](https://github.com/netdata/go.d.plugin/pull/717), [@ilyam8](https://github.com/ilyam8)) - Fix empty charts when ServerMPM is prefork ([#715](https://github.com/netdata/go.d.plugin/pull/715), [@ilyam8](https://github.com/ilyam8)) - Fix parsing multiple values in nginx upstream_response_time (go.d/web_log) ([#711](https://github.com/netdata/go.d.plugin/pull/711), [@kralewitz](https://github.com/kralewitz)) - Fix collecting metrics for Nodes with dots in name (go.d/k8s_state) ([#710](https://github.com/netdata/go.d.plugin/pull/710), [@ilyam8](https://github.com/ilyam8)) - Fix adding dimensions to User CPU Time chart at runtime (go.d/mysql) ([#689](https://github.com/netdata/go.d.plugin/pull/689), [@ilyam8](https://github.com/ilyam8)) </details> ### eBPF - Fix data collection frequency ([#13351](https://github.com/netdata/netdata/pull/13351), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix crash on cleanup ([#13259](https://github.com/netdata/netdata/pull/13259), [@thiagoftsm](https://github.com/thiagoftsm)) ### Exporting <details> <summary> Show 6 more contributions </summary> - Fix a base64_encode bug ([#13074](https://github.com/netdata/netdata/pull/13074), [@kklionz](https://github.com/kklionz)) - Fix sent metrics calculation ([#13435](https://github.com/netdata/netdata/pull/13435), [@vlvkobal](https://github.com/vlvkobal)) - Move host tags to netdata_info ([#13358](https://github.com/netdata/netdata/pull/13358), [@vlvkobal](https://github.com/vlvkobal)) - Fix exporting to OpenTSDB ([#13355](https://github.com/netdata/netdata/pull/13355), [@vlvkobal](https://github.com/vlvkobal)) - Fix exporting to Graphite ([#13261](https://github.com/netdata/netdata/pull/13261), [@vlvkobal](https://github.com/vlvkobal)) - Add exporting chart variables 
([#13221](https://github.com/netdata/netdata/pull/13221), [@boxjan](https://github.com/boxjan)) </details> ### Documentation 📄 Keeping our documentation healthy together with our awesome community. <details> <summary> Show 23 more contributions </summary> - Add a note about network interface monitoring when running in a Docker container ([#13458](https://github.com/netdata/netdata/pull/13458), [@ilyam8](https://github.com/ilyam8)) - Fix Anomaly Detection guide, so we can reference its subsections ([#13455](https://github.com/netdata/netdata/pull/13455), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix a typo in PostgreSQL section header ([#13440](https://github.com/netdata/netdata/pull/13440), [@shyamvalsan](https://github.com/shyamvalsan)) - Add Discord, YouTube, LinkedIn links to README ([#13419](https://github.com/netdata/netdata/pull/13419), [@andrewm4894](https://github.com/andrewm4894)) - Add ML bullet point to features section on README ([#13418](https://github.com/netdata/netdata/pull/13418), [@andrewm4894](https://github.com/andrewm4894)) - Fix docs metadata fields ([#13406](https://github.com/netdata/netdata/pull/13406), [@tkatsoulas](https://github.com/tkatsoulas)) - Clarify python.d haproxy module readme ([#13388](https://github.com/netdata/netdata/pull/13388), [@ilyam8](https://github.com/ilyam8)) - Add missing openSUSE 15.4 to platform support list. 
([#13373](https://github.com/netdata/netdata/pull/13373), [@Ferroin](https://github.com/Ferroin)) - Add another way to get ansible plays to Ansible quickstart ([#13349](https://github.com/netdata/netdata/pull/13349), [@mhkarimi1383](https://github.com/mhkarimi1383)) - Add GitHub stars badge to readme ([#13338](https://github.com/netdata/netdata/pull/13338), [@andrewm4894](https://github.com/andrewm4894)) - Explain new tiering mechanism in metric storage docs ([#13327](https://github.com/netdata/netdata/pull/13327), [@tkatsoulas](https://github.com/tkatsoulas)) - Add link to docker config section ([#13323](https://github.com/netdata/netdata/pull/13323) , [@cakrit](https://github.com/cakrit)) - Add a guide for troubleshooting Agent with Cloud connection for new nodes ([#13322](https://github.com/netdata/netdata/pull/13322), [@Ancairon](https://github.com/Ancairon)) - Update External Plugins API doc ([#13273](https://github.com/netdata/netdata/pull/13273), [@thiagoftsm](https://github.com/thiagoftsm)) - Update REST API documentation. 
([#13269](https://github.com/netdata/netdata/pull/13269), [@Ferroin](https://github.com/Ferroin)) - Add document explaining how to proxy Netdata via H2O ([#13266](https://github.com/netdata/netdata/pull/13266), [@Ferroin](https://github.com/Ferroin)) - Improve anomaly detection guide ([#13238](https://github.com/netdata/netdata/pull/13238), [@andrewm4894](https://github.com/andrewm4894)) - Improve configuration example in ML readme ([#13182](https://github.com/netdata/netdata/pull/13182), [@andrewm4894](https://github.com/andrewm4894)) - Docs housekeeping ([#13179](https://github.com/netdata/netdata/pull/13179), [@tkatsoulas](https://github.com/tkatsoulas)) - Add ML alerts examples ([#13173](https://github.com/netdata/netdata/pull/13173), [@andrewm4894](https://github.com/andrewm4894)) - Improve "if collector not there" section in Collectors readme ([#13152](https://github.com/netdata/netdata/pull/13152), [@cakrit](https://github.com/cakrit)) - Fix indentation in StatsD readme ([#13096](https://github.com/netdata/netdata/pull/13096), [@ilyam8](https://github.com/ilyam8)) - Add missing commands to daemon readme ([#13080](https://github.com/netdata/netdata/pull/13080), [@tkatsoulas](https://github.com/tkatsoulas)) </details> ### Packaging / Installation 📦 "Handle with care" - Just like handling physical packages, we put in a lot of care and effort to publish beautiful software packages. 
<details> <summary> Show 25 more contributions </summary> - Update go.d.plugin version to v0.34.0 ([#13484](https://github.com/netdata/netdata/pull/13484), [@ilyam8](https://github.com/ilyam8)) - Fix netdata-updater.sh sha256sum on BSDs ([#13391](https://github.com/netdata/netdata/pull/13391), [@tnyeanderson](https://github.com/tnyeanderson)) - Add Oracle Linux 9 to officially supported platforms ([#13367](https://github.com/netdata/netdata/pull/13367), [@Ferroin](https://github.com/Ferroin)) - Vendor Judy ([#13362](https://github.com/netdata/netdata/pull/13362), [@underhood](https://github.com/underhood)) - Add additional Docker image build with debug info included ([#13359](https://github.com/netdata/netdata/pull/13359), [@Ferroin](https://github.com/Ferroin)) - Fix not respecting CFLAGS arg when building Docker image ([#13340](https://github.com/netdata/netdata/pull/13340), [@ilyam8](https://github.com/ilyam8)) - Remove python-mysql from install-required-packages.sh ([#13288](https://github.com/netdata/netdata/pull/13288), [@ilyam8](https://github.com/ilyam8)) - Remove obsolete --use-system-lws option from netdata-installer.sh help ([#13272](https://github.com/netdata/netdata/pull/13272), [@Dim-P](https://github.com/Dim-P)) - Fix issues with DEB postinstall script ([#13252](https://github.com/netdata/netdata/pull/13252), [@Ferroin](https://github.com/Ferroin)) - Don’t pull in GCC for build if Clang is already present. 
([#13244](https://github.com/netdata/netdata/pull/13244), [@Ferroin](https://github.com/Ferroin)) - Upload packages to new self-hosted repository infrastructure ([#13240](https://github.com/netdata/netdata/pull/13240), [@Ferroin](https://github.com/Ferroin)) - Bump repoconfig package version used in kickstart.sh ([#13235](https://github.com/netdata/netdata/pull/13235), [@Ferroin](https://github.com/Ferroin)) - Properly handle interactivity in the updater code ([#13209](https://github.com/netdata/netdata/pull/13209), [@Ferroin](https://github.com/Ferroin)) - Don’t use realpath to find kickstart source path ([#13208](https://github.com/netdata/netdata/pull/13208), [@Ferroin](https://github.com/Ferroin)) - Ensure tmpdir is set for every function that uses it ([#13206](https://github.com/netdata/netdata/pull/13206), [@Ferroin](https://github.com/Ferroin)) - Add netdata user to secondary group in RPM package ([#13197](https://github.com/netdata/netdata/pull/13197), [@iigorkarpov](https://github.com/iigorkarpov)) - Remove a call to 'cleanup_old_netdata_updater()' because it is no longer exists ([#13189](https://github.com/netdata/netdata/pull/13189), [@ilyam8](https://github.com/ilyam8)) - Don’t manipulate positional parameters in DEB postinst script ([#13169](https://github.com/netdata/netdata/pull/13169), [@Ferroin](https://github.com/Ferroin)) - Add CAP_SYS_RAWIO to Netdata's systemd unit CapabilityBoundingSet ([#13154](https://github.com/netdata/netdata/pull/13154), [@ilyam8](https://github.com/ilyam8)) - Add netdata user to secondary group in DEB package ([#13109](https://github.com/netdata/netdata/pull/13109), [@iigorkarpov](https://github.com/iigorkarpov)) - Fix updating when using `--force-update` and new version of the updater script is available ([#13104](https://github.com/netdata/netdata/pull/13104), [@ilyam8](https://github.com/ilyam8)) - Remove unnecessary ‘cleanup’ code ([#13103](https://github.com/netdata/netdata/pull/13103), 
[@Ferroin](https://github.com/Ferroin)) - Remove official support for Debian 9. ([#13065](https://github.com/netdata/netdata/pull/13065), [@Ferroin](https://github.com/Ferroin)) - Add openSUSE Leap 15.4 to CI and package builds. ([#12270](https://github.com/netdata/netdata/pull/12270), [@Ferroin](https://github.com/Ferroin)) - Fix boolean value for ProtectControlGroups in the systemd unit file ([#11281](https://github.com/netdata/netdata/pull/11281), [@didier13150](https://github.com/didier13150)) </details> ### Other Notable Changes #### Improvements ⚙️ Greasing the gears to smoothen your experience with Netdata. <details> <summary> Show 19 more contributions </summary> - Enable rrdcontexts by default ([#13471](https://github.com/netdata/netdata/pull/13471), [@stelfrag](https://github.com/stelfrag)) - Add rrdcontext support for hidden charts ([#13466](https://github.com/netdata/netdata/pull/13466), [@ktsaou](https://github.com/ktsaou)) - Load host labels for archived hosts ([#13464](https://github.com/netdata/netdata/pull/13464), [@stelfrag](https://github.com/stelfrag)) - Add /api/v1/weights endpoint ([#13449](https://github.com/netdata/netdata/pull/13449), [@ktsaou](https://github.com/ktsaou)) - Add stats about currently collected metrics and disk space to tiering endpoint ([#13445](https://github.com/netdata/netdata/pull/13445), [@ktsaou](https://github.com/ktsaou)) - Show last 15 alerts in notification ([#13434](https://github.com/netdata/netdata/pull/13434), [@MrZammler](https://github.com/MrZammler)) - Add tiering statistics API endpoint ([#13420](https://github.com/netdata/netdata/pull/13420), [@ktsaou](https://github.com/ktsaou)) - Send chart context with alert events to the cloud ([#13409](https://github.com/netdata/netdata/pull/13409), [@MrZammler](https://github.com/MrZammler)) - Send node info message sooner ([#13348](https://github.com/netdata/netdata/pull/13348), [@MrZammler](https://github.com/MrZammler)) - Use new MQTT as default 
([#13287](https://github.com/netdata/netdata/pull/13287), [@underhood](https://github.com/underhood)) - Better ACLK debug communication log ([#13281](https://github.com/netdata/netdata/pull/13281), [@underhood](https://github.com/underhood)) - Add Multi-Tier database backend for long term metrics storage ([#13263](https://github.com/netdata/netdata/pull/13263), [@stelfrag](https://github.com/stelfrag)) - Add natural and virtual points support to Query Engine ([#13248](https://github.com/netdata/netdata/pull/13248), [@ktsaou](https://github.com/ktsaou)) - Delay health until obsoletions check is complete ([#13239](https://github.com/netdata/netdata/pull/13239), [@MrZammler](https://github.com/MrZammler)) - Enable ML by default ([#13158](https://github.com/netdata/netdata/pull/13158), [@andrewm4894](https://github.com/andrewm4894)) - Add multi-granularity support to Query Engine and MC improvements ([#13155](https://github.com/netdata/netdata/pull/13155), [@ktsaou](https://github.com/ktsaou)) - Add an option to use malloc for page cache instead of mmap ([#13142](https://github.com/netdata/netdata/pull/13142), [@stelfrag](https://github.com/stelfrag)) - Significantly improve metrics correlations (73x times faster) ([#13107](https://github.com/netdata/netdata/pull/13107), [@ktsaou](https://github.com/ktsaou)) - Add SSL received/send bytes statistics to ACLK ([#13091](https://github.com/netdata/netdata/pull/13091), [@underhood](https://github.com/underhood)) </details> #### Bug fixes 🐞 Increasing Netdata's reliability one bug fix at a time. 
<details> <summary> Show 16 more contributions </summary> - Fix crash on Agent startup if data rotation needs to be done ([#13473](https://github.com/netdata/netdata/pull/13473), [@stelfrag](https://github.com/stelfrag)) - Fix agent crash when archived host has not been registered to the cloud ([#13437](https://github.com/netdata/netdata/pull/13437), [@stelfrag](https://github.com/stelfrag)) - Fix gap filling on dbengine gaps ([#13417](https://github.com/netdata/netdata/pull/13417), [@MrZammler](https://github.com/MrZammler)) - Fix 32bit calculation on array allocator ([#13343](https://github.com/netdata/netdata/pull/13343), [@ktsaou](https://github.com/ktsaou)) - Fix crash on start on slow disks because ml is initialized before dbengine starts ([#13342](https://github.com/netdata/netdata/pull/13342), [@ktsaou](https://github.com/ktsaou)) - Fix crash when the host_labels health line contains the name/value of a label that does not exist on the host ([#13305](https://github.com/netdata/netdata/pull/13305), [@MrZammler](https://github.com/MrZammler)) - Fix incorrect dimension names in Redis alarms ([#13296](https://github.com/netdata/netdata/pull/13296), [@ilyam8](https://github.com/ilyam8)) - Fix Query Engine alignment ([#13282](https://github.com/netdata/netdata/pull/13282), [@ktsaou](https://github.com/ktsaou)) - Fix vbi parser in mqtt5 implementation ([#13277](https://github.com/netdata/netdata/pull/13277), [@underhood](https://github.com/underhood)) - Fix alignment in charts endpoint ([#13275](https://github.com/netdata/netdata/pull/13275), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix RAM calculation on macOS in system-info ([#13260](https://github.com/netdata/netdata/pull/13260), [@ilyam8](https://github.com/ilyam8)) - Fix data query on stale chart ([#13159](https://github.com/netdata/netdata/pull/13159), [@stelfrag](https://github.com/stelfrag)) - Fix crashes due to misaligned allocations ([#13137](https://github.com/netdata/netdata/pull/13137), 
[@ktsaou](https://github.com/ktsaou)) - Fix buffer overflow detected by the compiler ([#13120](https://github.com/netdata/netdata/pull/13120), [@ktsaou](https://github.com/ktsaou)) - Fix 100% CPU when using SSL and a child disconnect from a parent ([#13112](https://github.com/netdata/netdata/pull/13112), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix virtualization detection on FreeBSD ([#13087](https://github.com/netdata/netdata/pull/13087), [@ilyam8](https://github.com/ilyam8)) </details> ### Code organization 🏋️ Changes to keep our code base in good shape. <details> <summary> Show 49 more contributions </summary> - Handle cases where entries where stored as text (with strftime("%s")) ([#13472](https://github.com/netdata/netdata/pull/13472), [@stelfrag](https://github.com/stelfrag)) - Get last_entry_t only when st changes ([#13448](https://github.com/netdata/netdata/pull/13448), [@MrZammler](https://github.com/MrZammler)) - Store host label information in the metadata database ([#13441](https://github.com/netdata/netdata/pull/13441), [@stelfrag](https://github.com/stelfrag)) - Fix tests so that the actual metadata database is not accessed ([#13439](https://github.com/netdata/netdata/pull/13439), [@stelfrag](https://github.com/stelfrag)) - Delete aclk_alert table on start streaming from seq 1 batch 1 ([#13438](https://github.com/netdata/netdata/pull/13438), [@MrZammler](https://github.com/MrZammler)) - Query queue only for queries ([#13431](https://github.com/netdata/netdata/pull/13431), [@underhood](https://github.com/underhood)) - Add missing comma (handle coverity warning CID 379360) ([#13413](https://github.com/netdata/netdata/pull/13413), [@stelfrag](https://github.com/stelfrag)) - Remove python.d/web_log alarms ([#13404](https://github.com/netdata/netdata/pull/13404), [@ilyam8](https://github.com/ilyam8)) - Store host system information in the database ([#13402](https://github.com/netdata/netdata/pull/13402), [@stelfrag](https://github.com/stelfrag)) - 
Fix coverity issue 379240 (Unchecked return value) ([#13401](https://github.com/netdata/netdata/pull/13401), [@stelfrag](https://github.com/stelfrag)) - Fix bitmap unit tests ([#13374](https://github.com/netdata/netdata/pull/13374), [@stelfrag](https://github.com/stelfrag)) - Remove python.d collectors announced in v1.35.0 deprecation notice ([#13370](https://github.com/netdata/netdata/pull/13370), [@ilyam8](https://github.com/ilyam8)) - Address Coverity issues ([#13364](https://github.com/netdata/netdata/pull/13364), [@stelfrag](https://github.com/stelfrag)) - Omit first point if not needed in Query Engine ([#13345](https://github.com/netdata/netdata/pull/13345), [@ktsaou](https://github.com/ktsaou)) - Fix coverity 379241 ([#13336](https://github.com/netdata/netdata/pull/13336), [@MrZammler](https://github.com/MrZammler)) - Add Rrdcontext in memory indexing ([#13335](https://github.com/netdata/netdata/pull/13335), [@ktsaou](https://github.com/ktsaou)) - Detect stored metric size by page type ([#13334](https://github.com/netdata/netdata/pull/13334), [@stelfrag](https://github.com/stelfrag)) - Silence compile warnings on external source ([#13332](https://github.com/netdata/netdata/pull/13332), [@MrZammler](https://github.com/MrZammler)) - Add UpdateNodeCollectors message ([#13330](https://github.com/netdata/netdata/pull/13330), [@MrZammler](https://github.com/MrZammler)) - Fix Cid 379238 379238 ([#13328](https://github.com/netdata/netdata/pull/13328), [@stelfrag](https://github.com/stelfrag)) - Fix two helgrind reports ([#13325](https://github.com/netdata/netdata/pull/13325), [@vkalintiris](https://github.com/vkalintiris)) - Add array allocator for dbengine page descriptors ([#13312](https://github.com/netdata/netdata/pull/13312), [@ktsaou](https://github.com/ktsaou)) - Protect shared variables with log lock. 
([#13306](https://github.com/netdata/netdata/pull/13306), [@vkalintiris](https://github.com/vkalintiris)) - Null terminate string if file read was not successful ([#13299](https://github.com/netdata/netdata/pull/13299), [@stelfrag](https://github.com/stelfrag)) - Remove deprecated modules from python.d.conf ([#13264](https://github.com/netdata/netdata/pull/13264), [@ilyam8](https://github.com/ilyam8)) - Remove warnings while compiling ML on FreeBSD ([#13255](https://github.com/netdata/netdata/pull/13255), [@thiagoftsm](https://github.com/thiagoftsm)) - Remove strftime from statements and use unixepoch instead ([#13250](https://github.com/netdata/netdata/pull/13250), [@stelfrag](https://github.com/stelfrag)) - Updates the sqlite version in the agent ([#13233](https://github.com/netdata/netdata/pull/13233), [@stelfrag](https://github.com/stelfrag)) - Migrate data when machine GUID changes ([#13232](https://github.com/netdata/netdata/pull/13232), [@stelfrag](https://github.com/stelfrag)) - Add more sqlite unittests ([#13227](https://github.com/netdata/netdata/pull/13227), [@stelfrag](https://github.com/stelfrag)) - Add Netdata doubles ([#13217](https://github.com/netdata/netdata/pull/13217), [@ktsaou](https://github.com/ktsaou)) - Print INTERNAL BUG messages only when NETDATA_INTERNAL_CHECKS is enabled ([#13207](https://github.com/netdata/netdata/pull/13207), [@MrZammler](https://github.com/MrZammler)) - Add hostname in the worker structure to avoid constant lookups ([#13199](https://github.com/netdata/netdata/pull/13199), [@stelfrag](https://github.com/stelfrag)) - Allow for an easy way to do metadata migrations ([#13196](https://github.com/netdata/netdata/pull/13196), [@stelfrag](https://github.com/stelfrag)) - Add dictionaries with reference counters and full deletion support during traversal ([#13195](https://github.com/netdata/netdata/pull/13195), [@ktsaou](https://github.com/ktsaou)) - Add configuration for dbengine page fetch timeout and retry count 
([#13194](https://github.com/netdata/netdata/pull/13194), [@stelfrag](https://github.com/stelfrag)) - Clean sqlite prepared statements on thread shutdown ([#13193](https://github.com/netdata/netdata/pull/13193), [@stelfrag](https://github.com/stelfrag)) - Set default for `minimum num samples to train` to `900` ([#13174](https://github.com/netdata/netdata/pull/13174), [@andrewm4894](https://github.com/andrewm4894)) - Remove warnings when openssl 3 is used. ([#13170](https://github.com/netdata/netdata/pull/13170), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix coverity issues ([#13168](https://github.com/netdata/netdata/pull/13168), [@stelfrag](https://github.com/stelfrag)) - Allow traversing null-value dictionaries ([#13162](https://github.com/netdata/netdata/pull/13162), [@ktsaou](https://github.com/ktsaou)) - Use memset to mark the empty words in the quoted_strings_splitter function ([#13161](https://github.com/netdata/netdata/pull/13161), [@stelfrag](https://github.com/stelfrag)) - Fix labels unit test ([#13156](https://github.com/netdata/netdata/pull/13156), [@stelfrag](https://github.com/stelfrag)) - Use ks2 as MC default ([#13131](https://github.com/netdata/netdata/pull/13131), [@andrewm4894](https://github.com/andrewm4894)) - Allow label names to have slashes ([#13125](https://github.com/netdata/netdata/pull/13125) , [@ktsaou](https://github.com/ktsaou)) - Fix coveriry 379136 379135 379134 379133 ([#13123](https://github.com/netdata/netdata/pull/13123), [@ktsaou](https://github.com/ktsaou)) - Removes Legacy JSON Cloud Protocol Support In Agent ([#13111](https://github.com/netdata/netdata/pull/13111), [@underhood](https://github.com/underhood)) - Add labels with dictionary ([#13070](https://github.com/netdata/netdata/pull/13070), [@ktsaou](https://github.com/ktsaou)) - Fix coverity 378587 ([#13024](https://github.com/netdata/netdata/pull/13024), [@MrZammler](https://github.com/MrZammler)) </details> ## Deprecation notice <a 
id="v1360-deprecation-notice"></a> The following items will be removed in our next minor release (v1.37.0): > Patch releases (if any) will not be affected. | Component | Type | Will be replaced by | |----------------------------------------------------------------------------------------------------------|:---------:|:------------------------------------------------------------------------------------:| | [python.d/postgres](https://github.com/netdata/netdata/tree/v1.35.1/collectors/python.d.plugin/postgres) | collector | [go.d/postgres](https://github.com/netdata/go.d.plugin/tree/master/modules/postgres) | All the deprecated components will be moved to the [netdata/community](https://github.com/netdata/community) repository. ### Deprecated in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.35.0#v1350-deprecation-notice), the following items have been removed in this release: | Component | Type | Replaced by | |------------------------------------------------------------------------------------------------------------------------|:---------:|:--------------------------------------------------------------------------------------------------------:| | [python.d/chrony](https://github.com/netdata/netdata/tree/v1.35.0/collectors/python.d.plugin/chrony) | collector | [go.d/chrony](https://github.com/netdata/go.d.plugin/tree/master/modules/chrony) | | [python.d/ovpn_status_log](https://github.com/netdata/netdata/tree/v1.35.0/collectors/python.d.plugin/ovpn_status_log) | collector | [go.d/openvpn_status_log](https://github.com/netdata/go.d.plugin/tree/master/modules/openvpn_status_log) | ## Netdata Release Meetup <a id="v1360-release-meetup"></a> Join the Netdata team on the **11th of August** for the **Netdata Agent Release Meetup**, which will be held on the [Netdata Discord](https://discord.gg/pnpjpwfE?event=983676714062315560). 
Together we’ll cover: - Release Highlights - Acknowledgements - Q&A with the community [RSVP now](http://link.clicks.meetup.com/ls/click?upn=XbaZ37larFA-2FuV5MohrYpdrra25MtI4CzodbRR1Rd1mWsMzCapiG0iniZ1RRDZa7M-2FmHJLr-2Bb1mwUmcBaBZ8Cam3PoKxQZRiJnTtMl1cO8m8wO4IoeZ6hjSnp6JFSdQGaVWBUc-2BAiXyf81aG6v8Ve5EqfNbzCZ5wmt3hTmZrMJp-2Fl2hrLKdtCxhQeP72Gsg5fNWiyjP2Nxsrv5dtQoonvkpBp5mn0X4fqzjgeOcKntWZLp7DvemM3TT1awYmyc3YJ5R-2FZRl-2BUv320g16xCVILGLGF-2Fz53ZcKuCbjUUu3NjyPDEQ-2BxudxUvSQB7Sin79-2BQtp-2FtPK8VDiUqgFLZx2Sew-3D-3D1J0E_YqVQspEN-2BKecRhYybdJ9snXNet7-2BJNbsxhQt5sGUhHn-2B3h0sjCoLQZFpLyEEi06yW8HtSWUqFQ8RX94qvnbgZuLdW8uYWdad2gjc5U8fb4luM-2F9J4pMkRjOpBlzC5U2p9nzpAFxCsRzOfWvFm34xgVkn0cMFHwGbklq4RhCMXUKisd07b9Mejd8tVIUYOtlkdB2-2BFsHKwwhSVq1fmV0Ea9HCxXuNmhcqIB31iyRU3wY9rwoofkCa8QUfXHuQJI780ODo7WV8OoPQBUR59f8ULt5MUTi9VWHAOqgoM3XdgHibL-2BY3ptMEC-2FnTb2T01ZUKsZFP5YsAe5JYvsX4SVTOElQufU6SVr-2BRcJ8LgiyX6abJJ-2F7DVZAS-2FI-2F4UT8Ic2cXBz72bgUhj8RTgLkh7qQsqRnKg0g37v-2F-2BOAlgbx1-2FbxdSHEbJnHxUqbL-2Ft0i2QclB-2FmbKPqnX-2BUL4VmrH9oR1UFjdByh0-2BGSs0zIVj-2FRM49UJ3-2F-2FcpAReC04YJl-2BOjRIb23fgMbnN5N-2BMp6pDVJ8oaFIZkVX5zbpKLc3xQ1n0b4RCLV3kMUoOon-2FELqbuK3hN5G9sKNNBc98bbI2WFXmNOd60J5Z6DLHIwL3jiEcLP89uaiQpYfe9Rj5mp-2F5lyd2Dv91pCYRCO1upoQJ-2Fz-2BDZOcauZyTSqFWGiGp14jutpN7fBt-2FC2mrEYgx6-2B0Ahw7CQoDwWo84e4Kl0xTi7W-2B9WXCKXNKoCrSFpx6bZGKtQgn0-3D) We look forward to meeting you. ## Support options <a id="v1360-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [Github Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. 
- [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1100 engineers are already using it! 2022-08-10T20:04:46+00:00 netdata v1.36.1 netdata v1.36.1 2022-08-15T16:41:05+00:00 # Release v1.36.1 Netdata v1.36.1 is a patch release to address two issues discovered since v1.36.0. [Refer to the v1.36.0 release notes](https://github.com/netdata/netdata/releases/tag/v1.36.0) for the full scope of that release. The v1.36.1 patch release fixes the following: - An issue that could cause agents running on 32-bit distributions to crash during data exchange with the cloud ([PR #13511](https://github.com/netdata/netdata/pull/13511)). - An issue with the handling of the Go plugin in the installer code that prevented the new WireGuard collector from working without user intervention ([PR #13507](https://github.com/netdata/netdata/pull/13507)). ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud/): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. 
- [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1100 engineers are already using it! 2022-08-15T16:41:05+00:00 netdata v1.37.0 netdata v1.37.0 2022-11-30T18:12:17+00:00 ## **IMPORTANT NOTICE** <a id="v1370-important-security-notice"></a> This release fixes two security issues, one in streaming authorization and another in the execution of alarm notification commands. **All users are advised to update to this version or any later one!** Credit goes to [Stefan Schiller](https://github.com/stefan-schiller-sonarsource) of SonarSource.com for identifying both of them. Thank you, Stefan! ## Netdata release v1.37 introduction Another release of the Netdata Monitoring solution is here! We focused on these key areas: 1. [**Infinite scalability**](#v1370-inifinite) of the Netdata Ecosystem 2. Default [**Database Tiering**](#v1370-db-retention), offering **months of data retention** for typical Netdata Agent installations with default settings and **years of data retention** for dedicated Netdata Parents. 3. [**Overview Dashboards**](#v1370-overview-dash) at Netdata Cloud got a ton of improvements to allow **slicing and dicing of data directly on the UI** and overcome the limitations of the web technology when thousands of charts are presented on one page. 4. Integration with [**Grafana for custom dashboards**](#v1370-grafana-plugin), using Netdata Cloud as an infrastructure-wide time-series data source for metrics 5. [**PostgreSQL** monitoring](#v1370-postgressql) completely rewritten offering state-of-the-art monitoring of the database performance and health, even at the table and index level. Read more about this release in the following sections! 
**Table of contents** <!-- To link within the page, the anchors need to be HTML tags: <a unique_id="vXX.X.X-topic-id"></a> We include all of the H2,H3 in the TOC. The Headings of the release notes we will include to the TOC tree is up to the Release notes manager. --> - [Release Highlights](#v1370-release-highlights) - [Infinite scalability](#v1370-inifinite) - [Database retention](#v1370-db-retention) - [New and improved system service integration](#v1370-system-service) - [Plugins function extension](#v1370-plugins-extension) - [Disk based data indexing](#v1370-disk-indexing) - [Overview dashboard](#v1370-overview-dash) - [Single node dashboard improvements](#v1370-single-node-dash) - [Netdata data source plugin for Grafana](#v1370-grafana-plugin) - [New Unseen node state](#v1370-unseen) - [Blogposts & demo space use-case rooms](#v1370-blog-use-case) - [Tech debt and performance improvements](#v1370-tech-debt) - [Internal Improvements](#v1370-internal) - [Acknowledgments](#v1370-ack) - [Contributions](#v1370-contributions) - [Deprecation and product notices](#v1370-deprecation) - [Netdata release meetup](#v1370-release-meetup) - [Support options](#v1370-support-options) > ❗ We're keeping our codebase healthy by removing features that are end of life. Read the [deprecation notices](#v1370-deprecation) to check if you are affected. ### Netdata open-source growth <a id="v1360-open-source-growth"></a> - Over 61,000 GitHub Stars - Almost four million monitored servers - Almost 85 million sessions served - Rapidly approaching a half million total nodes in Netdata Cloud ## Release highlights <a id="v1370-release-highlights"></a> ### Infinite scalability <a id="v1370-inifinite"></a> Scalability is one of the biggest challenges of monitoring solutions. Almost every commercial or open-source solution assumes that metrics should be centralized to a time-series database, which is then queried to provide dashboards and alarms. 
This centralization, however, has two key problems: 1. The scalability of the monitoring solutions is significantly limited, since growing these central databases can quickly become tricky, if it is possible at all. 2. To improve scalability and control the monitoring infrastructure cost, almost all solutions limit granularity (the data collection frequency) and cardinality (the number of metrics monitored). At Netdata we love **high fidelity** monitoring. We want granularity to be "per second" as a standard for all metrics, and we want to monitor as many metrics as possible, without limits. <details> <summary>Read more about our improvements to scalability</summary> The only way to achieve our goal is by scaling out. Instead of centralizing everything into one huge time-series database, we have many smaller centralization points that can be used seamlessly all together like a giant distributed database. **This is what Netdata Cloud does!** It connects to all your Netdata agents and seamlessly aggregates data from all of them to provide infrastructure and service level dashboards and alarms. ![image](https://user-images.githubusercontent.com/2662304/199225735-01a41cc5-c074-4fe2-b780-5f08e92c6769.png) Netdata Cloud does not collect or store all the data collected; that is one of its most beautiful and unique qualities. It only needs active connections to the Netdata Agents having the metrics. The Netdata Agents store all metrics in their own time-series databases (we call it [`dbengine`](https://learn.netdata.cloud/docs/agent/database/engine#tiering-in-a-nutshell), and it is embedded into the Netdata Agents). In this release, we introduce a new way for the Agents to communicate their metadata to the cloud. To minimize the amount of traffic exchanged between Netdata Cloud and Agents, we only transfer a very limited amount of metadata. 
We call this information `contexts`, and it is pretty much limited to the unique metric names collected, coupled with the actual retention (first and last timestamps) that each agent has available for query. At the same time, to overcome the limitations of having hundreds of thousands of Agents concurrently connected to Netdata Cloud, we are now using EMQX as the message broker that connects Netdata Agents to Netdata Cloud. As the community grows, the next step planned is to have such message brokers in five continents, to minimize the round-trip latency for querying Netdata Agents through Netdata Cloud. We also see [**Netdata Parents**](https://learn.netdata.cloud/docs/agent/streaming) as a key component of our ecosystem. A Netdata Parent is a Netdata Agent that acts as a centralization point for other Netdata Agents. The idea is simple: any Netdata Agent (Child) can delegate all its functions, except data collection, to any other Netdata Agent (Parent), and by doing so, the latter now becomes a Netdata Parent. This means that metrics storage, metrics querying, health monitoring, and machine learning can be handled by the Netdata Parent, on behalf of the Netdata Children that push metrics to it. This functionality is crucial for our ecosystem for the following reasons: 1. Some nodes are **ephemeral** and may vanish at any point in time. But we need their metric data. 2. Other nodes may be **too sensitive** to run all the features of a Netdata Agent. On such nodes we needed a way to use the absolute minimum of system resources for anything else except the core application that the node is hosting. So, on these Netdata Agents we can disable metrics storage, health monitoring, machine learning and push all metrics to another Netdata Agent that has the resources to spare for these tasks. 3. **High availability** of metric data. In our industry, "one = none." We need at least 2 of everything and this is true for metric data too. 
Parents allow us to replicate databases, even having different retention on each, thus significantly improving the availability of metrics data. **In this release** we introduce significant improvements to Netdata Parents: 1. **Streaming Compression**<br/>The communication between Netdata Agents is now compressed using LZ4 streaming compression, saving more than 70% of the bandwidth. TLS communication was already implemented and can be combined with compression. 2. **Active-Active Parents Clusters**<br/>A Parent cluster of 2+ nodes can be configured by linking each of the parents to the others. Our configuration can easily take care of the circular dependency this implies. For 2 nodes you configure: A->B and B->A. For 3 nodes: A->B/C, B->A/C, C->A/B. Once the parents are set up, configure Netdata Agents to push metrics to any of them (for 2 Parent nodes: A/B, for 3 Parent nodes: A/B/C). Each Netdata Agent will send metrics to only one of the configured parents at a time, but it can be any of them. Then the Parent agents will re-stream metrics to each other. 3. **Replication of past data**<br/>Now Parents can request missing data from each other and the origin data collecting Agent. This works seamlessly when two agents connect to each other (both have to be the latest version). They exchange information about the retention each has and they automatically fill in the gaps of the Parent agent, ensuring no data are lost at the Parents, even if a Parent was offline for some time (the default max replication duration is 1 day, but it can be tuned in `stream.conf` - and the connecting Agent Child needs to have data for at least that long in order for them to be replicated). 4. **Performance Improvements**<br/>Now Netdata Parents can digest about 700k metric values per second per origin Agent. This is a huge improvement over the previous one of 400k. Also, when establishing a connection, the agents can accept about 2k metadata definitions per second per origin Agent. 
We moved all metadata management to a separate thread, and now we are seeing 80k metric definitions per second per origin Agent, making new Agent connections enter the metrics streaming phase almost instantly. All these improvements establish a huge step forward in providing an infinitely scalable monitoring infrastructure. </details> ### Database retention <a id="v1370-db-retention"></a> Many users think of Netdata Agent as an amazing single-node monitoring solution, offering limited real-time retention of metrics. This changed gradually over the years as we introduced [`dbengine`](https://learn.netdata.cloud/docs/agent/database/engine#tiering-in-a-nutshell) for storing metrics and then, in the previous release, [database tiering](https://learn.netdata.cloud/guides/longer-metrics-storage#tiering), which allows Netdata to downscale metrics and store them for a longer duration. As of this release, we now enable tiering by default! So, a typical Netdata Agent installation, with default settings, will now have 3 database tiers, **offering a retention of about 120 - 150 days**, using just 0.5 GB of disk space! This is coupled with another significant achievement. Traditionally, the Agent dashboard showed only currently collected metrics. The dashboard of Netdata Cloud, however, should present all the metrics that were available for the selected time-frame, independently of whether they are currently being collected or not. This is especially important for highly volatile environments, like [Kubernetes](https://learn.netdata.cloud/docs/cloud/visualize/kubernetes), where metrics come and go all the time. So, in this release, we rewrote the query engine of the Netdata Agent to properly query metrics independently of them being currently collected or not. In practice, the Agent is now sliced in two big modules: data collection and querying. 
These two parts do not depend on each other any more, allowing dashboards to query metrics for any time-frame for which data are available. This feature of querying past data even for non-collected metrics is available now via Netdata Cloud Overview dashboards. ### New and improved system service integration <a id="v1370-system-service"></a> We have completely rewritten the part of the installer responsible for setting up Netdata as a system service. This includes a number of major improvements over the old code, including the following: - Instead of deciding which type of system service to install based on the distribution name and release, we now actively detect which service manager is in use and use that. This provides significantly better behavior on non-systemd systems, many of which were not actually getting the correct service type installed. - On FreeBSD systems, we now correctly install the rc.d script for Netdata to `/usr/local/etc/rc.d` instead of `/etc/rc.d`. - We now correctly enable and disable the agent as a system service for all service managers we officially support. In particular, this means that users who are using a supported service manager should not need to do anything to enable the service. - Similarly, we now properly start the agent through the system service manager for all supported service managers. - We now have improved support for installing as a system service under WSL, including support for systemd in WSL, and correct fallbacks to LSB or initd style init scripts. This should make using Netdata under WSL much easier. - We now support installing service files for Netdata on offline systemd or OpenRC systems. This should greatly simplify installing the agent in containers or as part of setting up a virtual machine template. - Numerous minor improvements. 
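
As a rough illustration of the difference between guessing from the distribution name and probing the running system, here is a minimal Python sketch. It is not the installer's actual code (the installer is a shell script), and the function name `detect_service_manager`, its return labels, and the exact set of probes are assumptions made for this example. The probe paths themselves are common conventions: `/run/systemd/system` exists when systemd is the running init, and `/run/openrc/softlevel` exists when the system was booted by OpenRC.

```python
import os
import platform
from typing import Callable


def detect_service_manager(
    path_exists: Callable[[str], bool] = os.path.exists,
    system: str = "",
) -> str:
    """Guess the active service manager by probing the live system.

    Hypothetical helper for illustration only. `path_exists` and `system`
    are injectable so the logic can be exercised without touching the
    real filesystem.
    """
    system = system or platform.system()

    # FreeBSD uses rc.d scripts; third-party ones live in /usr/local/etc/rc.d.
    if system == "FreeBSD":
        return "freebsd-rc.d"

    # systemd creates this directory only when it is the running init.
    if path_exists("/run/systemd/system"):
        return "systemd"

    # OpenRC records its current runlevel here once it has booted the system.
    if path_exists("/run/openrc/softlevel"):
        return "openrc"

    # Fall back to classic init scripts if the directory is present.
    if path_exists("/etc/init.d"):
        return "initd"

    return "unknown"


if __name__ == "__main__":
    print(detect_service_manager())
```

Because the probes look at what is actually running, a distribution that ships systemd packages but boots with OpenRC is classified by its real init system rather than by its name.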
Additionally, this release includes a number of improvements to our OpenRC init script, bringing it more in-line with best practices for OpenRC init scripts, fixing a handful of bugs, and making it easier to run Netdata under OpenRC’s native process supervision. We plan to continue improving this area in upcoming release cycles as well, including further improvements to our OpenRC support and preliminary support for installing Netdata as a service on systems using Runit. ### Plugins function extension <a id="v1370-plugins-extension"></a> As of this release, plugins can now register functions to the agent that can be executed on demand to provide real time, detailed and specific chart data. Via [streaming](#v1370-streaming-replication), the definitions of functions are now transmitted to a parent and seamlessly exposed to the agent. ### Disk based data indexing <a id="v1370-disk-indexing"></a> Agents now build an optimized disk-based index file to reduce memory requirements up to 90%. In turn, the Agent startup time improved by 1,000% (You read this right; this is not a typo!). ### **Overview** dashboard <a id="v1370-overview-dash"></a> The [**Overview** dashboard](https://learn.netdata.cloud/docs/cloud/visualize/overview) is the key dashboard of the Netdata ecosystem. We are constantly putting effort into improving this dashboard so that it will eventually be unnecessary to use anything else. Unlike the Netdata Agent dashboard, the Netdata Cloud **Overview** dashboard is multi-node, providing infrastructure and service level views of the metrics, seamlessly aggregating and correlating metrics from all Netdata Agents that participate in a war room. 
We believe that dashboards should be fully automated and out-of-the-box, providing all the means for slicing and dicing data without learning any query language, without editing chart definitions, and without having a deep understanding of the underlying metrics, so that the monitoring system is fully functional and ready to be used for troubleshooting the moment it is installed. <details> <summary>Read more about our improvements to the Overview dashboard</summary> Moving towards this goal, in this release we introduce the following improvements: 1. A complete rewrite of the underlying core of the dashboard now offers huge **performance improvements** on dashboards with thousands of charts. Before this work, when the dashboard had thousands of charts, several seconds were required to jump from the top of the dashboard to the end. Now it is instant. 2. We went through all the data collection plugins and metrics and we **added labels** to all of them, allowing the default charts on the **Overview** dashboard to pivot the charts, **slicing and dicing** the data according to these labels. For example, network interfaces charts can be pivoted by device name or interface type, while at the same time filtered by any of the labels, dimensions, instances or nodes. ![image](https://user-images.githubusercontent.com/2662304/199255851-2258c5cf-77a1-4a6b-999c-e325532ef7df.png) 3. We have started working on new **summary tiles** to summarize the sections of the dashboard in a more dynamic manner. This work has just started and we expect to introduce a lot of new changes heading into the next release! ![image](https://user-images.githubusercontent.com/2662304/199256852-bdcc78d8-6061-4f1b-be9f-9cc358fa47d4.png) </details> ### Single node dashboard improvements <a id="v1370-single-node-dash"></a> The Single Node view dashboard now uses the same engine as the [Overview](#v1370-overview-dash). 
With this, you get a more consistent experience, but also: * The ability to run metric correlations across many nodes in your infrastructure. * All the grouping and filtering functions of the overview. * Reduced memory usage on the agent, as the old endpoints get deprecated. We are working to bring similar improvements to the local Agent dashboard. In the meantime, it will look different than the Single Node view on Netdata Cloud. On Netdata Cloud we use composite charts, instead of separate charts, for each instance. ![image](https://user-images.githubusercontent.com/82235632/200645561-6acb415b-6436-4d9e-a9b1-f8d6c7847ed3.png) ### Netdata data source plugin for Grafana <a id="v1370-grafana-plugin"></a> This initial release of the Netdata data source plugin aims to maximize the troubleshooting capabilities of Netdata in Grafana, making them more widely available. It combines Netdata’s powerful collector engine with Grafana's amazing visualization capabilities! <details> <summary>Read more about our source plugin for Grafana</summary> ![explorer_9ae3iwJHsD](https://user-images.githubusercontent.com/82235632/196962412-0b87be39-cfa4-419d-8959-96104803b4a4.png) We expect that the Open-Source community will get a lot of value from this plugin, so we don’t plan on stopping here. We want to keep improving this plugin! We already have some enhancements on our backlog, including the following plans: * Enabling variable functionality * Allowing filtering with multiple key-value combinations * Providing sample templates for certain use-cases, e.g. monitoring PostgreSQL We would love to get you involved in this project! If you have ideas on things you'd like to see or just want to share a cool dashboard you've set up, you're more than welcome to [contribute](https://github.com/netdata/netdata-grafana-datasource-plugin). 
Check out our [blogpost](https://www.netdata.cloud/blog/introducing-netdata-source-plugin-for-grafana) and [YouTube video](https://www.youtube.com/watch?v=uhvrnFbHlvk) on this new plugin to see how it can work best for you. </details> ### New `Unseen` node state <a id="v1370-unseen"></a> To provide better visibility into the different reasons why a node is **Offline**, we split this status into two separate statuses, so that you can now distinguish cases where a node _never_ connected to Netdata Cloud successfully. The following list presents the current node statuses and their meaning: * **Live:** Node is actually collecting and streaming metrics to Cloud * **Stale:** Node is currently offline and not streaming metrics to Cloud. It can show historical data from a parent node * **Offline:** Node is currently offline, not streaming metrics to Cloud and not available in any parent node * **Unseen:** Node has never been connected to Cloud; it is claimed but no successful connection was established <image src="https://user-images.githubusercontent.com/82235632/196972603-4a967979-6cc5-47d6-af78-249365107bb4.png" height="65%" width="65%"/> There are different reasons why a node can't connect; the most common ones fall into one of the following three categories: * The claiming process of the kickstart script was unsuccessful * Claiming on an older, deprecated version of the Agent * Network issues while connecting to the Cloud For some guidelines on how to solve these issues, check our [docs here](https://learn.netdata.cloud/guides/troubleshoot/troubleshooting-agent-with-cloud-connection). ### Blogposts & Demo space use-case rooms <a id="v1370-blog-use-case"></a> To better showcase the capabilities and upgrades of Netdata, we have made available multiple rooms in our [Demo space](https://app.netdata.cloud/spaces/netdata-demo/) to allow you to experience the power and simplicity of Netdata with live infrastructure monitoring. 
#### PostgreSQL monitoring <a id="v1370-postgressql"></a> Netdata's new PostgreSQL collector offers a fully revamped, comprehensive PostgreSQL DB monitoring experience. 100+ PostgreSQL metrics are collected and visualized across 60+ composite charts. Netdata now collects metrics at per database, per table and per index granularity (besides the metrics that are global to the entire DB cluster) and lets users explore which table or index has a specific problem such as a high cache miss rate, a low rows fetched ratio (indicative of missing indexes) or bloat that's eating up valuable space. The new collector also includes built-in alerts for several problem scenarios that a user is likely to run into on a PostgreSQL cluster. For more information, read our [docs](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/postgres) or our [blog](https://blog.netdata.cloud/postgresql-monitoring/) for a deep dive into PostgreSQL and why these metrics matter. ![image](https://user-images.githubusercontent.com/24860547/200192667-756a30ae-d9f7-46e7-8195-fbea50215c42.png) #### Redis monitoring Netdata's Redis collector was updated to include new metrics crucial for database performance monitoring, such as latency, and new built-in alerts. For the full list of Redis metrics now available, read our [docs](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis) or our [blog](https://www.netdata.cloud/blog/redis-monitoring) for a deeper dive into Redis monitoring. ![image](https://user-images.githubusercontent.com/24860547/200192628-73d0bf35-9de4-4916-9c3a-b208b4ccfb24.png) #### Cassandra monitoring Netdata now monitors Cassandra, and comes with 25+ charts for all key Cassandra metrics. The collected metrics include throughput, latency, cache (key cache + row cache), disk usage and compaction, as well as JVM runtime metrics such as garbage collection. Any potential errors and exceptions that occur on your Cassandra cluster are also monitored. 
For more information, read our [docs](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/cassandra) or our [blog](https://blog.netdata.cloud/cassandra-monitoring-part2/). ![image](https://user-images.githubusercontent.com/24860547/200192648-75bb5e92-add7-4930-a191-a5f45829cb16.png) ### Tech debt and Infrastructure improvements <a id="v1370-tech-debt"></a> To further improve Netdata Cloud and your user experience, multiple points around tech debt and infrastructure improvements have been completed. To name some of the key achievements: * A huge improvement has been made on our **Overview** tab on Netdata Cloud; we improved the performance around the navigation on the **Table of Contents (TOC)** and the charts on the viewport, contributing to a much better UX * The repos that support our FE have all been upgraded to node 16, putting us on the Active Long Term Support (LTS) version * We've replaced our MQTT broker VerneMQ with EMQX, which brings much more stability to the product. ### Internal improvements <a id="v1370-internal"></a> #### Asynchronous storing of metadata We have improved the speed of chart creation by 70x. In lab tests creating 30,000 charts with 10 dimensions each, we achieved a chart creation rate of 7,000 charts/second (vs. 100 charts/second previously). #### Per host alert processing Alert processing for a host (e.g. child connected to a parent) is now done on its own host. Time-consuming health related initialization functions are deferred as needed and parallelized to improve performance. #### Dictionary code improvements Code improvements have been made to make use of dictionaries, better managing the life cycle of objects (creation, usage, and destruction using reference counters) and reducing explicit locking to access resources. ## Acknowledgments <a id="v1370-ack"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. 
The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - [@HG00](https://github.com/HG00) for improving RabbitMQ collector readme. - [@KickerTom](https://github.com/KickerTom) for improving Makefiles. - [@MAH69IK](https://github.com/MAH69IK) for adding an option to retry on telegram API limit error. - [@Pulseeey](https://github.com/Pulseeey) for adding CloudLinux OS detection during installation and update. - [@candrews](https://github.com/candrews) for improving netdata.service. - [@uplime](https://github.com/uplime) for fixing a typo in netdata-installer.sh. - [@vobruba-martin](https://github.com/vobruba-martin) for adding TCP socket connection support and the state path modification. - [@yasharne](https://github.com/yasharne) for adding ProxySQL collector. ## Contributions <a id="v1370-contributions"></a> ### Collectors ⚙️ Enhancing our collectors to collect all the data you need. 
#### New collectors <details> <summary>Show 9 more contributions</summary> - Add Pandas collector (python.d/pandas) ([#13773](https://github.com/netdata/netdata/pull/13773), [@andrewm4894](https://github.com/andrewm4894)) - Add NGINX Plus collector (go.d/nginxplus) ([#992](https://github.com/netdata/go.d.plugin/pull/992), [@ilyam8](https://github.com/ilyam8)) - Add NVMe collector (go.d/nvme) ([#973](https://github.com/netdata/go.d.plugin/pull/973), [@ilyam8](https://github.com/ilyam8)) - Add Ping collector (go.d/ping) ([#952](https://github.com/netdata/go.d.plugin/pull/952), [@ilyam8](https://github.com/ilyam8)) - Add Cassandra collector (go.d/cassandra) ([#901](https://github.com/netdata/go.d.plugin/pull/901), [@thiagoftsm](https://github.com/thiagoftsm)) - Add systemd-logind collector (go.d/logind) ([#786](https://github.com/netdata/go.d.plugin/pull/786), [@ilyam8](https://github.com/ilyam8)) - Add Docker collector (go.d/docker) ([#760](https://github.com/netdata/go.d.plugin/pull/760), [@ilyam8](https://github.com/ilyam8)) - Add PgBouncer collector (go.d/pgbouncer) ([#748](https://github.com/netdata/go.d.plugin/pull/748), [@ilyam8](https://github.com/ilyam8)) - Add ProxySQL collector (go.d/proxysql) ([#703](https://github.com/netdata/go.d.plugin/pull/703), [@yasharne](https://github.com/yasharne)) </details> #### Improvements 🐞 Improving our collectors one bug fix at a time. 
<details> <summary>Show 71 more contributions</summary> - Allow statsd tags to modify chart metadata on the fly (stats.d.plugin) ([#14014](https://github.com/netdata/netdata/pull/14014), [@ktsaou](https://github.com/ktsaou)) - Add Cassandra icon to dashboard info (go.d/cassandra) ([#13975](https://github.com/netdata/netdata/pull/13975), [@ilyam8](https://github.com/ilyam8)) - Add ping dashboard info and alarms (go.d/ping) ([#13916](https://github.com/netdata/netdata/pull/13916), [@ilyam8](https://github.com/ilyam8)) - Add WMI Process dashboard info (go.d/wmi) ([#13910](https://github.com/netdata/netdata/pull/13910), [@thiagoftsm](https://github.com/thiagoftsm)) - Add processes dashboard info (go.d/wmi) ([#13910](https://github.com/netdata/netdata/pull/13910), [@thiagoftsm](https://github.com/thiagoftsm)) - Add TCP dashboard description (go.d/wmi) ([#13878](https://github.com/netdata/netdata/pull/13878), [@thiagoftsm](https://github.com/thiagoftsm)) - Add Cassandra dashboard description (go.d/cassandra) ([#13835](https://github.com/netdata/netdata/pull/13835), [@thiagoftsm](https://github.com/thiagoftsm)) - Respect NETDATA_INTERNALS_MONITORING (python.d.plugin) ([#13793](https://github.com/netdata/netdata/pull/13793), [@ilyam8](https://github.com/ilyam8)) - Add ZFS hit rate charts (proc.plugin) ([#13757](https://github.com/netdata/netdata/pull/13757), [@vlvkobal](https://github.com/vlvkobal)) - Add alarms filtering via config (python.d/alarms) ([#13701](https://github.com/netdata/netdata/pull/13701), [@andrewm4894](https://github.com/andrewm4894)) - Add ProxySQL dashboard info (go.d/proxysql) ([#13669](https://github.com/netdata/netdata/pull/13669), [@ilyam8](https://github.com/ilyam8)) - Update PostgreSQL dashboard info (go.d/postgres) ([#13661](https://github.com/netdata/netdata/pull/13661), [@ilyam8](https://github.com/ilyam8)) - Add _collect_job label (job name) to charts (python.d.plugin) ([#13648](https://github.com/netdata/netdata/pull/13648), 
[@ilyam8](https://github.com/ilyam8)) - Re-add chrome to the webbrowser group (apps.plugin) ([#13642](https://github.com/netdata/netdata/pull/13642), [@Ferroin](https://github.com/Ferroin)) - Add labels to charts (tc.plugin) ([#13634](https://github.com/netdata/netdata/pull/13634), [@ktsaou](https://github.com/ktsaou)) - Improve the gui and email app groups and improve GUI coverage (apps.plugin) ([#13631](https://github.com/netdata/netdata/pull/13631), [@Ferroin](https://github.com/Ferroin)) - Update Postgres "connections" dashboard info (go.d/postgres) ([#13619](https://github.com/netdata/netdata/pull/13619), [@ilyam8](https://github.com/ilyam8)) - Assorted updates for apps_groups.conf (apps.plugin) ([#13618](https://github.com/netdata/netdata/pull/13618), [@Ferroin](https://github.com/Ferroin)) - Add spiceproxy to proxmox group (apps.plugin) ([#13615](https://github.com/netdata/netdata/pull/13615), [@ilyam8](https://github.com/ilyam8)) - Improve coverage of Linux kernel threads (apps.plugin) ([#13612](https://github.com/netdata/netdata/pull/13612), [@Ferroin](https://github.com/Ferroin)) - Improve dashboard info for WAL and checkpoints (go.d/postgres) ([#13607](https://github.com/netdata/netdata/pull/13607), [@shyamvalsan](https://github.com/shyamvalsan)) - Update logind dashboard info (go.d/logind) ([#13597](https://github.com/netdata/netdata/pull/13597), [@ilyam8](https://github.com/ilyam8)) - Add collecting power state (python.d/nvidia_smi) ([#13580](https://github.com/netdata/netdata/pull/13580), [@ilyam8](https://github.com/ilyam8)) - Improve PostgreSQL dashboard info (go.d/postgres) ([#13573](https://github.com/netdata/netdata/pull/13573), [@shyamvalsan](https://github.com/shyamvalsan)) - Add apt group to apps_groups.conf (apps.plugin) ([#13571](https://github.com/netdata/netdata/pull/13571), [@andrewm4894](https://github.com/andrewm4894)) - Add more monitoring tools to apps_groups.conf (apps.plugin) ([#13566](https://github.com/netdata/netdata/pull/13566),
[@andrewm4894](https://github.com/andrewm4894)) - Add docker dashboard info (go.d/docker) ([#13547](https://github.com/netdata/netdata/pull/13547), [@ilyam8](https://github.com/ilyam8)) - Add discovering chips, and features at runtime (python.d/sensors) ([#13545](https://github.com/netdata/netdata/pull/13545), [@ilyam8](https://github.com/ilyam8)) - Add summary dashboard for PostgreSQL (go.d/postgres) ([#13534](https://github.com/netdata/netdata/pull/13534), [@shyamvalsan](https://github.com/shyamvalsan)) - Add jupyter to apps_groups.conf (apps.plugin) ([#13533](https://github.com/netdata/netdata/pull/13533), [@andrewm4894](https://github.com/andrewm4894)) - Improve performance and add co-re support for more modules (ebpf.plugin) ([#13530](https://github.com/netdata/netdata/pull/13530), [@thiagoftsm](https://github.com/thiagoftsm)) - Use LVM UUIDs in chart ids for logical volumes (proc.plugin) ([#13525](https://github.com/netdata/netdata/pull/13525), [@vlvkobal](https://github.com/vlvkobal)) - Reduce CPU and memory usage (ebpf.plugin) ([#13397](https://github.com/netdata/netdata/pull/13397), [@thiagoftsm](https://github.com/thiagoftsm)) - Add 'domain' label to charts (go.d/whoisquery) ([#1002](https://github.com/netdata/go.d.plugin/pull/1002), [@ilyam8](https://github.com/ilyam8)) - Add 'source' label to charts (go.d/x509check) ([#1001](https://github.com/netdata/go.d.plugin/pull/1001), [@ilyam8](https://github.com/ilyam8)) - Add 'host' label to charts (go.d/portcheck) ([#1000](https://github.com/netdata/go.d.plugin/pull/1000), [@ilyam8](https://github.com/ilyam8)) - Add 'url' label to charts (go.d/httpcheck) ([#999](https://github.com/netdata/go.d.plugin/pull/999), [@ilyam8](https://github.com/ilyam8)) - Remove pipeline instance from family and add it as a chart label (go.d/logstash) ([#998](https://github.com/netdata/go.d.plugin/pull/998), [@ilyam8](https://github.com/ilyam8)) - Add http cache io/iops metrics (go.d/nginxplus) 
([#997](https://github.com/netdata/go.d.plugin/pull/997), [@ilyam8](https://github.com/ilyam8)) - Add resolver metrics (go.d/nginxplus) ([#996](https://github.com/netdata/go.d.plugin/pull/996), [@ilyam8](https://github.com/ilyam8)) - Add MSSQL metrics (go.d/wmi) ([#991](https://github.com/netdata/go.d.plugin/pull/991), [@thiagoftsm](https://github.com/thiagoftsm)) - Add IIS data collection job (go.d/web_log) ([#977](https://github.com/netdata/go.d.plugin/pull/977), [@thiagoftsm](https://github.com/thiagoftsm)) - Add IIS metrics (go.d/wmi) ([#972](https://github.com/netdata/go.d.plugin/pull/972), [@thiagoftsm](https://github.com/thiagoftsm)) - Add services metrics (go.d/wmi) ([#961](https://github.com/netdata/go.d.plugin/pull/961), [@thiagoftsm](https://github.com/thiagoftsm)) - Resolve 'hostname' in job name (go.d.plugin) ([#959](https://github.com/netdata/go.d.plugin/pull/959), [@ilyam8](https://github.com/ilyam8)) - Add processes metrics (go.d/wmi) ([#953](https://github.com/netdata/go.d.plugin/pull/953), [@thiagoftsm](https://github.com/thiagoftsm)) - Resolve 'hostname' in URL (go.d.plugin) ([#941](https://github.com/netdata/go.d.plugin/pull/938), [@ilyam8](https://github.com/ilyam8)) - Add TCP metrics (go.d/wmi) ([#938](https://github.com/netdata/go.d.plugin/pull/938), [@thiagoftsm](https://github.com/thiagoftsm)) - Add collection of Table_open_cache_overflows (go.d/dns_query) ([#936](https://github.com/netdata/go.d.plugin/pull/936), [@ilyam8](https://github.com/ilyam8)) - Allow to set a list of record types in config (go.d/dns_query) ([#912](https://github.com/netdata/go.d.plugin/pull/912), [@ilyam8](https://github.com/ilyam8)) - Create a chart per server instead of a dimension per server (go.d/dns_query) ([#911](https://github.com/netdata/go.d.plugin/pull/911), [@ilyam8](https://github.com/ilyam8)) - Respect NETDATA_INTERNALS_MONITORING env variable (go.d.plugin) ([#908](https://github.com/netdata/go.d.plugin/pull/908), [@ilyam8](https://github.com/ilyam8)) - 
Add query status chart (go.d/dns_query) ([#903](https://github.com/netdata/go.d.plugin/pull/903), [@ilyam8](https://github.com/ilyam8)) - Add collection of agent metrics (go.d/consul) ([#900](https://github.com/netdata/go.d.plugin/pull/900), [@ilyam8](https://github.com/ilyam8)) - Create a chart per health check (go.d/consul) ([#899](https://github.com/netdata/go.d.plugin/pull/899), [@ilyam8](https://github.com/ilyam8)) - Add collection of master link status (go.d/redis) ([#856](https://github.com/netdata/go.d.plugin/pull/856), [@ilyam8](https://github.com/ilyam8)) - Add collection of master slave link metrics (go.d/redis) ([#851](https://github.com/netdata/go.d.plugin/pull/851), [@ilyam8](https://github.com/ilyam8)) - Add collection of time elapsed since last RDB save (go.d/redis) ([#850](https://github.com/netdata/go.d.plugin/pull/850), [@ilyam8](https://github.com/ilyam8)) - Add ping latency chart (go.d/redis) ([#849](https://github.com/netdata/go.d.plugin/pull/849), [@ilyam8](https://github.com/ilyam8)) - Check for 'connect' privilege before querying database size (go.d/postgres) ([#845](https://github.com/netdata/go.d.plugin/pull/845), [@ilyam8](https://github.com/ilyam8)) - Allow to set data collection job labels in config (go.d.plugin) ([#840](https://github.com/netdata/go.d.plugin/pull/840), [@ilyam8](https://github.com/ilyam8)) - Improve histogram buckets dimensions (go.d/postgres) ([#833](https://github.com/netdata/go.d.plugin/pull/833), [@ilyam8](https://github.com/ilyam8)) - Add acquired locks utilization chart (go.d/postgres) ([#831](https://github.com/netdata/go.d.plugin/pull/831), [@ilyam8](https://github.com/ilyam8)) - Add _collect_job label (job name) to charts (go.d.plugin) ([#814](https://github.com/netdata/go.d.plugin/pull/814), [@ilyam8](https://github.com/ilyam8)) - Add TCP socket connection support and the state path modification (go.d/phpfpm) ([#805](https://github.com/netdata/go.d.plugin/pull/805), 
[@vobruba-martin](https://github.com/vobruba-martin)) - Create a dimension for every unit state (go.d/systemdunits) ([#795](https://github.com/netdata/go.d.plugin/pull/795), [@ilyam8](https://github.com/ilyam8)) - Improve Galera state and status charts ([#779](https://github.com/netdata/go.d.plugin/pull/779), [@ilyam8](https://github.com/ilyam8)) - Add discovering dhcp-ranges at runtime (go.d/dnsmasq_dhcp) ([#778](https://github.com/netdata/go.d.plugin/pull/778), [@ilyam8](https://github.com/ilyam8)) - Add collecting image and volume stats (go.d/docker) ([#777](https://github.com/netdata/go.d.plugin/pull/777), [@ilyam8](https://github.com/ilyam8)) - Add Percona MySQL compatibility (go.d/mysql) ([#776](https://github.com/netdata/go.d.plugin/pull/776), [@ilyam8](https://github.com/ilyam8)) - Add collection of additional user statistics metrics ([#775](https://github.com/netdata/go.d.plugin/pull/775), [@ilyam8](https://github.com/ilyam8)) </details> #### Bug fixes <details> <summary>Show 24 more contributions</summary> - Fix eBPF crashes on exit (ebpf.plugin) ([#14012](https://github.com/netdata/netdata/pull/14012), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix not working on Oracle linux (ebpf.plugin) ([#13935](https://github.com/netdata/netdata/pull/13935), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix retry logic when reading network interfaces speed (proc.plugin) ([#13893](https://github.com/netdata/netdata/pull/13893), [@ilyam8](https://github.com/ilyam8)) - Fix systemd chart update (ebpf.plugin) ([#13884](https://github.com/netdata/netdata/pull/13884), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix handling qemu-1- prefix when extracting virsh domain ([#13866](https://github.com/netdata/netdata/pull/13866), [@ilyam8](https://github.com/ilyam8)) - Fix collection of carrier, duplex, and speed metrics when network interface is down (proc.plugin) ([#13850](https://github.com/netdata/netdata/pull/13850), [@vlvkobal](https://github.com/vlvkobal)) - 
Fix various issues (ebpf.plugin) ([#13624](https://github.com/netdata/netdata/pull/13624), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix apps plugin users charts description (apps.plugin) ([#13621](https://github.com/netdata/netdata/pull/13621), [@ilyam8](https://github.com/ilyam8)) - Fix chart id length check (cgroups.plugin) ([#13601](https://github.com/netdata/netdata/pull/13601), [@ilyam8](https://github.com/ilyam8)) - Fix not respecting update_every for polling (python.d/nvidia_smi) ([#13579](https://github.com/netdata/netdata/pull/13579), [@ilyam8](https://github.com/ilyam8)) - Fix containers name resolution when Docker is a snap package (cgroups.plugin) ([#13523](https://github.com/netdata/netdata/pull/13523), [@ilyam8](https://github.com/ilyam8)) - Fix handling string and float values (go.d/nvme) ([#993](https://github.com/netdata/go.d.plugin/pull/993), [@ilyam8](https://github.com/ilyam8)) - Fix handling ExpirationDate with space (go.d/whoisquery) ([#974](https://github.com/netdata/go.d.plugin/pull/974), [@ilyam8](https://github.com/ilyam8)) - Fix query queryable databases (go.d/postgres) ([#960](https://github.com/netdata/go.d.plugin/pull/960), [@ilyam8](https://github.com/ilyam8)) - Fix not respecting headers config option (go.d/pihole) ([#942](https://github.com/netdata/go.d.plugin/pull/942), [@ilyam8](https://github.com/ilyam8)) - Fix dns_queries_percentage metric calculation (go.d/pihole) ([#922](https://github.com/netdata/go.d.plugin/pull/922), [@ilyam8](https://github.com/ilyam8)) - Fix data collection when auth.bind query is not supported (go.d/dnsmasq) ([#902](https://github.com/netdata/go.d.plugin/pull/902), [@ilyam8](https://github.com/ilyam8)) - Fix data collection when too many db tables and indexes (go.d/postgres) ([#857](https://github.com/netdata/go.d.plugin/pull/857), [@ilyam8](https://github.com/ilyam8)) - Fix creation of bloat charts if no bloat metrics collected (go.d/postgres) 
([#846](https://github.com/netdata/go.d.plugin/pull/846), [@ilyam8](https://github.com/ilyam8)) - Fix unregistering connStr at runtime (go.d/postgres) ([#843](https://github.com/netdata/go.d.plugin/pull/843), [@ilyam8](https://github.com/ilyam8)) - Fix bloat size percentage calculation (go.d/postgres) ([#841](https://github.com/netdata/go.d.plugin/pull/841), [@ilyam8](https://github.com/ilyam8)) - Fix charts when binary log and MyISAM are disabled (go.d/mysql) ([#763](https://github.com/netdata/go.d.plugin/pull/763), [@ilyam8](https://github.com/ilyam8)) - Fix data collection jobs cleanup on exit (go.d.plugin) ([#758](https://github.com/netdata/go.d.plugin/pull/758), [@ilyam8](https://github.com/ilyam8)) - Fix handling the case when no images are found (go.d/docker) ([#739](https://github.com/netdata/go.d.plugin/pull/739), [@ilyam8](https://github.com/ilyam8)) </details> #### Other <details> <summary>Show 11 more contributions</summary> - Don't let slow disk plugin thread delay shutdown ([#14044](https://github.com/netdata/netdata/pull/14044), [@MrZammler](https://github.com/MrZammler)) - Remove nginx_plus collector (python.d.plugin) ([#13995](https://github.com/netdata/netdata/pull/13995), [@ilyam8](https://github.com/ilyam8)) - Enable collecting ECC memory errors by default ([#13970](https://github.com/netdata/netdata/pull/13970), [@ilyam8](https://github.com/ilyam8)) - Make Statsd dictionaries multi-threaded ([#13938](https://github.com/netdata/netdata/pull/13938), [@ktsaou](https://github.com/ktsaou)) - Remove NFS readahead histogram (proc.plugin) ([#13819](https://github.com/netdata/netdata/pull/13819), [@vlvkobal](https://github.com/vlvkobal)) - Merge netstat, snmp, and snmp6 modules (proc.plugin) ([#13806](https://github.com/netdata/netdata/pull/13806), [@vlvkobal](https://github.com/vlvkobal)) - Rename dockerd job on lock registration (python.d/dockerd) ([#13537](https://github.com/netdata/netdata/pull/13537), [@ilyam8](https://github.com/ilyam8)) - Remove 
python.d/* announced in v1.36.0 deprecation notice (python.d.plugin) ([#13503](https://github.com/netdata/netdata/pull/13503), [@ilyam8](https://github.com/ilyam8)) - Remove blocklist file existence state chart (go.d/pihole) ([#914](https://github.com/netdata/go.d.plugin/pull/914), [@ilyam8](https://github.com/ilyam8)) - Remove instance-specific information from chart families (go.d/portcheck) ([#790](https://github.com/netdata/go.d.plugin/pull/790), [@ilyam8](https://github.com/ilyam8)) - Remove spaces in "HTTP Response Time" chart dimensions (go.d/httpcheck) ([#788](https://github.com/netdata/go.d.plugin/pull/788), [@ilyam8](https://github.com/ilyam8)) </details> ### Documentation 📄 Keeping our documentation healthy together with our awesome community. #### Updates <details> <summary>Show 24 more contributions</summary> - Add Alpine 3.17 to supported distros ([#14056](https://github.com/netdata/netdata/pull/14056), [@Ferroin](https://github.com/Ferroin)) - Fix securing streaming communications steps ([#14024](https://github.com/netdata/netdata/pull/14024), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix a typo in Uninstall docs ([#14002](https://github.com/netdata/netdata/pull/14002), [@tkatsoulas](https://github.com/tkatsoulas)) - Use calculator app instead of spreadsheet ([#13981](https://github.com/netdata/netdata/pull/13981), [@andrewm4894](https://github.com/andrewm4894)) - Document password param for tor collector ([#13966](https://github.com/netdata/netdata/pull/13966), [@andrewm4894](https://github.com/andrewm4894)) - Reference the bash collector for RPi ([#13907](https://github.com/netdata/netdata/pull/13907), [@cakrit](https://github.com/cakrit)) - Improve intro paragraph for sensors collector ([#13906](https://github.com/netdata/netdata/pull/13906), [@cakrit](https://github.com/cakrit)) - Add pandas collector to collectors.md ([#13895](https://github.com/netdata/netdata/pull/13895), [@andrewm4894](https://github.com/andrewm4894)) - Update dbengine 
options in step-09.md ([#13864](https://github.com/netdata/netdata/pull/13864), [@DShreve2](https://github.com/DShreve2)) - Fix a typo in pandas collector readme ([#13853](https://github.com/netdata/netdata/pull/13853), [@andrewm4894](https://github.com/andrewm4894)) - Add up-to-date info on improving performance ([#13801](https://github.com/netdata/netdata/pull/13801), [@cakrit](https://github.com/cakrit)) - Update fping plugin documentation with better details about the required version ([#13765](https://github.com/netdata/netdata/pull/13765), [@Ferroin](https://github.com/Ferroin)) - Provide details on label filtering/custom labels ([#13745](https://github.com/netdata/netdata/pull/13745), [@DShreve2](https://github.com/DShreve2)) - Add a note that nvidia-smi does not work inside a container ([#13695](https://github.com/netdata/netdata/pull/13695), [@ilyam8](https://github.com/ilyam8)) - Add info for Docker containers about using hostname from host ([#13685](https://github.com/netdata/netdata/pull/13685), [@Ferroin](https://github.com/Ferroin)) - Update dictionary documentation ([#13679](https://github.com/netdata/netdata/pull/13679), [@ktsaou](https://github.com/ktsaou)) - Update uninstaller documentation ([#13627](https://github.com/netdata/netdata/pull/13627), [@Ferroin](https://github.com/Ferroin)) - Add link to the performance optimization guide ([#13595](https://github.com/netdata/netdata/pull/13595), [@cakrit](https://github.com/cakrit)) - Update macOS community support details ([#13536](https://github.com/netdata/netdata/pull/13536), [@DShreve2](https://github.com/DShreve2)) - Update FreeIPMI and CUPS plugin documentation ([#13526](https://github.com/netdata/netdata/pull/13526), [@Ferroin](https://github.com/Ferroin)) - Remove reference to charts now in netdata monitoring ([#13521](https://github.com/netdata/netdata/pull/13521), [@andrewm4894](https://github.com/andrewm4894)) - Add a note about authorized_mailq_users to postfix readme 
([#13515](https://github.com/netdata/netdata/pull/13515), [@ilyam8](https://github.com/ilyam8)) - Add a document outlining how to build native packages locally ([#12431](https://github.com/netdata/netdata/pull/12431), [@Ferroin](https://github.com/Ferroin)) - Add some tips on collecting per-queue metrics for RabbitMQ ([#12227](https://github.com/netdata/netdata/pull/12227), [@HG00](https://github.com/HG00)) </details> ### Health #### Engine - Add support of chart labels in alerts ([#13290](https://github.com/netdata/netdata/pull/13290), [@MrZammler](https://github.com/MrZammler)) #### Notifications - Add an option to retry on telegram API limit error ([#13119](https://github.com/netdata/netdata/pull/13119), [@MAH69IK](https://github.com/MAH69IK)) - Set default curl connection timeout if not set ([#13529](https://github.com/netdata/netdata/pull/13529), [@ilyam8](https://github.com/ilyam8)) #### Alarms <details> <summary>Show 12 more contributions</summary> - Use 'host' label in alerts info (health.d/ping.conf) ([#13955](https://github.com/netdata/netdata/pull/13955), [@ilyam8](https://github.com/ilyam8)) - Remove pihole_blocklist_gravity_file_existence_state (health.d/pihole.conf) ([#13826](https://github.com/netdata/netdata/pull/13826), [@ilyam8](https://github.com/ilyam8)) - Fix the systemd_mount_unit_failed_state alarm name (health.d/systemdunits.conf) ([#13796](https://github.com/netdata/netdata/pull/13796), [@tkatsoulas](https://github.com/tkatsoulas)) - Add 1m delay for tcp reset alarms (health.d/tcp_resets.conf) ([#13761](https://github.com/netdata/netdata/pull/13761), [@ilyam8](https://github.com/ilyam8)) - Add new Redis alarms (health.d/redis.conf) ([#13715](https://github.com/netdata/netdata/pull/13715), [@ilyam8](https://github.com/ilyam8)) - Fix inconsistent alert class names ([#13699](https://github.com/netdata/netdata/pull/13699), [@ralphm](https://github.com/ralphm)) - Disable Postgres last vacuum/analyze alarms (health.d/postgres.conf) 
([#13698](https://github.com/netdata/netdata/pull/13698), [@ilyam8](https://github.com/ilyam8)) - Add node level AR based example (health.d/ml.conf) ([#13684](https://github.com/netdata/netdata/pull/13684), [@andrewm4894](https://github.com/andrewm4894)) - Add Postgres alarms (health.d/postgres.conf) ([#13671](https://github.com/netdata/netdata/pull/13671), [@ilyam8](https://github.com/ilyam8)) - Adjust systemdunits alarms (health.d/systemdunits.conf) ([#13623](https://github.com/netdata/netdata/pull/13623), [@ilyam8](https://github.com/ilyam8)) - Add Postgres total connection utilization alarm (health.d/postgres.conf) ([#13620](https://github.com/netdata/netdata/pull/13620), [@ilyam8](https://github.com/ilyam8)) - Adjust mysql_galera_cluster_size_max_2m lookup to make time in warn/crit predictable (health.d/mysql.conf) ([#13563](https://github.com/netdata/netdata/pull/13563), [@ilyam8](https://github.com/ilyam8)) </details> ### Packaging / Installation #### Changes <details> <summary>Show 28 more contributions</summary> - Fix writing to stdout if static update is successful ([#14058](https://github.com/netdata/netdata/pull/14058), [@ilyam8](https://github.com/ilyam8)) - Update go.d.plugin to v0.45.0 ([#14052](https://github.com/netdata/netdata/pull/14052), [@ilyam8](https://github.com/ilyam8)) - Provide improved messaging in the kickstart script for existing installs managed by the system package manager ([#13947](https://github.com/netdata/netdata/pull/13947), [@Ferroin](https://github.com/Ferroin)) - Add CAP_NET_RAW to go.d.plugin ([#13909](https://github.com/netdata/netdata/pull/13909), [@ilyam8](https://github.com/ilyam8)) - Record installation command in telemetry events ([#13892](https://github.com/netdata/netdata/pull/13892), [@Ferroin](https://github.com/Ferroin)) - Overhaul generation of distinct-ids for install telemetry events ([#13891](https://github.com/netdata/netdata/pull/13891), [@Ferroin](https://github.com/Ferroin)) - Prompt users about 
updates/claiming on unknown install types ([#13890](https://github.com/netdata/netdata/pull/13890), [@Ferroin](https://github.com/Ferroin)) - Fix duplicate error code in kickstart.sh ([#13887](https://github.com/netdata/netdata/pull/13887), [@Ferroin](https://github.com/Ferroin)) - Properly guard commands when installing services for offline service managers ([#13848](https://github.com/netdata/netdata/pull/13848), [@Ferroin](https://github.com/Ferroin)) - Fix service installation on FreeBSD. ([#13842](https://github.com/netdata/netdata/pull/13842), [@Ferroin](https://github.com/Ferroin)) - Improve error and warning messages in the kickstart script ([#13825](https://github.com/netdata/netdata/pull/13825), [@Ferroin](https://github.com/Ferroin)) - Properly propagate errors from installer/updater to kickstart script ([#13802](https://github.com/netdata/netdata/pull/13802), [@Ferroin](https://github.com/Ferroin)) - Fix runtime directory ownership when installed as non-root user ([#13797](https://github.com/netdata/netdata/pull/13797), [@Ferroin](https://github.com/Ferroin)) - Stop pulling in netcat as a mandatory dependency ([#13787](https://github.com/netdata/netdata/pull/13787), [@Ferroin](https://github.com/Ferroin)) - Add Ubuntu 22.10 to supported distros, CI, and package builds ([#13785](https://github.com/netdata/netdata/pull/13785), [@Ferroin](https://github.com/Ferroin)) - Allow netdata installer to install and run netdata as any user ([#13780](https://github.com/netdata/netdata/pull/13780), [@ktsaou](https://github.com/ktsaou)) - Update libbpf to v1.0.1 ([#13778](https://github.com/netdata/netdata/pull/13778), [@thiagoftsm](https://github.com/thiagoftsm)) - Further improvements to the new service installation code ([#13774](https://github.com/netdata/netdata/pull/13774), [@Ferroin](https://github.com/Ferroin)) - Use /bin/sh instead of ls to detect glibc ([#13758](https://github.com/netdata/netdata/pull/13758), [@MrZammler](https://github.com/MrZammler)) - Add 
CloudLinux OS detection to the updater script ([#13752](https://github.com/netdata/netdata/pull/13752), [@Pulseeey](https://github.com/Pulseeey)) - Add CloudLinux OS detection to kickstart ([#13750](https://github.com/netdata/netdata/pull/13750), [@Pulseeey](https://github.com/Pulseeey)) - Fix handling of temporary directories in kickstart code. ([#13744](https://github.com/netdata/netdata/pull/13744), [@Ferroin](https://github.com/Ferroin)) - Fix a typo in netdata-installer.sh ([#13514](https://github.com/netdata/netdata/pull/13514), [@uplime](https://github.com/uplime)) - Add CAP_NET_ADMIN for go.d.plugin ([#13507](https://github.com/netdata/netdata/pull/13507), [@ilyam8](https://github.com/ilyam8)) - Update PIDFile in netdata.service to avoid systemd legacy path warning ([#13504](https://github.com/netdata/netdata/pull/13504), [@candrews](https://github.com/candrews)) - Overhaul handling of installation of Netdata as a system service. ([#13451](https://github.com/netdata/netdata/pull/13451), [@Ferroin](https://github.com/Ferroin)) - Fix existing install detection for FreeBSD and macOS ([#13243](https://github.com/netdata/netdata/pull/13243), [@Ferroin](https://github.com/Ferroin)) - Assorted cleanup in the OpenRC init script ([#13115](https://github.com/netdata/netdata/pull/13115), [@Ferroin](https://github.com/Ferroin)) </details> ### Other Notable Changes ⚙️ Greasing the gears to smooth your experience with Netdata. 
#### Improvements <details> <summary>Show 9 more contributions</summary> - Add replication of metrics (gaps filling) during streaming ([#13873](https://github.com/netdata/netdata/pull/13873), [@vkalintiris](https://github.com/vkalintiris)) - Remove anomaly rates chart ([#13763](https://github.com/netdata/netdata/pull/13763), [@vkalintiris](https://github.com/vkalintiris)) - Add disabling netdata monitoring section of the dashboard ([#13788](https://github.com/netdata/netdata/pull/13788), [@ktsaou](https://github.com/ktsaou)) - Add host labels for ephemerality and nodes with unstable connections ([#13784](https://github.com/netdata/netdata/pull/13784), [@underhood](https://github.com/underhood)) - Allow netdata plugins to expose functions for querying more information about specific charts ([#13720](https://github.com/netdata/netdata/pull/13720), [@ktsaou](https://github.com/ktsaou)) - Improve Health engine performance by adding a thread per host ([#13712](https://github.com/netdata/netdata/pull/13712), [@MrZammler](https://github.com/MrZammler)) - Improve streaming performance by 25% on the child ([#13708](https://github.com/netdata/netdata/pull/13708), [@ktsaou](https://github.com/ktsaou)) - Improve agent shutdown time ([#13649](https://github.com/netdata/netdata/pull/13649), [@stelfrag](https://github.com/stelfrag)) - Add disabling Cloud functionality via NETDATA_DISABLE_CLOUD environment variable ([#13106](https://github.com/netdata/netdata/pull/13106), [@ilyam8](https://github.com/ilyam8)) </details> #### Bug Fixes 🐞 Increasing Netdata's reliability, one bug fix at a time.
<details> <summary>Show 46 more contributions</summary> - Fix sanitizing command arguments executed by the health component ([#14064](https://github.com/netdata/netdata/pull/14064), [@vkalintiris](https://github.com/vkalintiris)) - Fix control of streaming API keys and MACHINE GUIDs in stream.conf ([#14063](https://github.com/netdata/netdata/pull/14063), [@ktsaou](https://github.com/ktsaou)) - Fix build on old versions of openssl on Centos ([#14045](https://github.com/netdata/netdata/pull/14045), [@underhood](https://github.com/underhood)) - Fix merging duplicate replication requests ([#14037](https://github.com/netdata/netdata/pull/14037), [@ktsaou](https://github.com/ktsaou)) - Fix various problems in streaming compression, query planner and replication ([#14023](https://github.com/netdata/netdata/pull/14023), [@ktsaou](https://github.com/ktsaou)) - Fix ACLK connection resets on parents with a lot of children ([#14004](https://github.com/netdata/netdata/pull/14004), [@underhood](https://github.com/underhood)) - Fix crash when netdata cannot execute its external plugins ([#13978](https://github.com/netdata/netdata/pull/13978), [@ktsaou](https://github.com/ktsaou)) - Fix metrics suffix for counters when using remote write exporter ([#13977](https://github.com/netdata/netdata/pull/13977), [@vlvkobal](https://github.com/vlvkobal)) - Fix replicating non-existing child host ([#13968](https://github.com/netdata/netdata/pull/13968), [@ktsaou](https://github.com/ktsaou)) - Fix local dashboard cloud links ([#13953](https://github.com/netdata/netdata/pull/13953), [@underhood](https://github.com/underhood)) - Fix stopping Netdata on WSL1 ([#13948](https://github.com/netdata/netdata/pull/13948), [@MrZammler](https://github.com/MrZammler)) - Fix negative values when removing a "percentage-of-incremental-row" dimension ([#13945](https://github.com/netdata/netdata/pull/13945), [@ktsaou](https://github.com/ktsaou)) - Fix chart definition end time_t printing and parsing 
([#13942](https://github.com/netdata/netdata/pull/13942), [@ktsaou](https://github.com/ktsaou)) - Fix not using system CA certificates when streaming ([#13941](https://github.com/netdata/netdata/pull/13941), [@MrZammler](https://github.com/MrZammler)) - Fix segfault when a dimension is deleted while replicated ([#13932](https://github.com/netdata/netdata/pull/13932), [@ktsaou](https://github.com/ktsaou)) - Fix compiling without dbengine ([#13931](https://github.com/netdata/netdata/pull/13931), [@ilyam8](https://github.com/ilyam8)) - Fix crash on query plan switch ([#13920](https://github.com/netdata/netdata/pull/13920), [@ktsaou](https://github.com/ktsaou)) - Fix crash when free hosts if a change on db mode is not needed ([#13912](https://github.com/netdata/netdata/pull/13912), [@ktsaou](https://github.com/ktsaou)) - Fix timeframe matching in query engine ([#13911](https://github.com/netdata/netdata/pull/13911), [@ktsaou](https://github.com/ktsaou)) - Fix reading health "enable" from the configuration ([#13894](https://github.com/netdata/netdata/pull/13894), [@stelfrag](https://github.com/stelfrag)) - Fix segmentation fault on 32-bit RPi ([#13876](https://github.com/netdata/netdata/pull/13876), [@MrZammler](https://github.com/MrZammler)) - Fix ml_info call via ACLK ([#13863](https://github.com/netdata/netdata/pull/13863), [@underhood](https://github.com/underhood)) - Fix compiling with LTO enabled on FreeBSD ([#13854](https://github.com/netdata/netdata/pull/13854), [@MrZammler](https://github.com/MrZammler)) - Fix tiers update frequency ([#13844](https://github.com/netdata/netdata/pull/13844), [@ktsaou](https://github.com/ktsaou)) - Fix crash on child reconnect and lost metrics ([#13821](https://github.com/netdata/netdata/pull/13821), [@stelfrag](https://github.com/stelfrag)) - Fix post-processing of contexts ([#13807](https://github.com/netdata/netdata/pull/13807), [@ktsaou](https://github.com/ktsaou)) - Fix initialization of chart variables 
([#13795](https://github.com/netdata/netdata/pull/13795), [@MrZammler](https://github.com/MrZammler)) - Fix Array Allocator memory leak([#13792](https://github.com/netdata/netdata/pull/13792), [@ktsaou](https://github.com/ktsaou)) - Fix chart variables initialization ([#13786](https://github.com/netdata/netdata/pull/13786), [@MrZammler](https://github.com/MrZammler)) - Fix compilation on CentOS 7.9 ([#13775](https://github.com/netdata/netdata/pull/13775), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix count of currently streaming senders on the localhost ([#13755](https://github.com/netdata/netdata/pull/13755), [@MrZammler](https://github.com/MrZammler)) - Fix streaming crash when child reconnects and is archived on the parent ([#13754](https://github.com/netdata/netdata/pull/13754), [@stelfrag](https://github.com/stelfrag)) - Fix sending NodeInfo during first database cleanup ([#13740](https://github.com/netdata/netdata/pull/13740), [@MrZammler](https://github.com/MrZammler)) - Fix starting an archived host in dbengine if dbengine is not compiled ([#13724](https://github.com/netdata/netdata/pull/13724), [@stelfrag](https://github.com/stelfrag)) - Fix building judy without dbengine ([#13703](https://github.com/netdata/netdata/pull/13703), [@underhood](https://github.com/underhood)) - fix typo not deleting collected flag; force removing collected flag on child disconnect ([#13672](https://github.com/netdata/netdata/pull/13672), [@ktsaou](https://github.com/ktsaou)) - Fix access to old data when nmap is used ([#13666](https://github.com/netdata/netdata/pull/13666), [@stelfrag](https://github.com/stelfrag)) - Fix container virtualization info detection ([#13653](https://github.com/netdata/netdata/pull/13653), [@vlvkobal](https://github.com/vlvkobal)) - Fix rrdcontexts left in the post-processing queue from the garbage collector ([#13645](https://github.com/netdata/netdata/pull/13645), [@ktsaou](https://github.com/ktsaou)) - Fix a memory leak on archived host 
creation ([#13641](https://github.com/netdata/netdata/pull/13641), [@stelfrag](https://github.com/stelfrag)) - Fix worker utilization cleanup ([#13633](https://github.com/netdata/netdata/pull/13633), [@stelfrag](https://github.com/stelfrag)) - Fix loading db rows when chart_id or dim_id is null ([#13608](https://github.com/netdata/netdata/pull/13608), [@MrZammler](https://github.com/MrZammler)) - Fix crash on rrdcontext apis when rrdcontexts is not initialized ([#13578](https://github.com/netdata/netdata/pull/13578), [@ktsaou](https://github.com/ktsaou)) - Fix a failure to build eBPF with CMake ([#13568](https://github.com/netdata/netdata/pull/13568), [@underhood](https://github.com/underhood)) - Fix a crash when xen libraries are misconfigured ([#13535](https://github.com/netdata/netdata/pull/13535), [@vlvkobal](https://github.com/vlvkobal)) - Fix crashes on 32bit system ([#13511](https://github.com/netdata/netdata/pull/13511), [@MrZammler](https://github.com/MrZammler)) </details> ### Code organization #### Changes <details> <summary>Show 92 more contributions</summary> - Replication fixes 8 ([#14061](https://github.com/netdata/netdata/pull/14061), [@ktsaou](https://github.com/ktsaou)) - Replication fixes 7 ([#14053](https://github.com/netdata/netdata/pull/14053), [@ktsaou](https://github.com/ktsaou)) - Remove eBPF plugin warning ([#14047](https://github.com/netdata/netdata/pull/14047), [@thiagoftsm](https://github.com/thiagoftsm)) - Replication fixes 6 ([#14046](https://github.com/netdata/netdata/pull/14046), [@ktsaou](https://github.com/ktsaou)) - Fix dictionaries unittest ([#14042](https://github.com/netdata/netdata/pull/14042), [@ktsaou](https://github.com/ktsaou)) - Improve log message in case of ACLK SSL error ([#14041](https://github.com/netdata/netdata/pull/14041), [@underhood](https://github.com/underhood)) - Replication fixes 5 ([#14038](https://github.com/netdata/netdata/pull/14038), [@ktsaou](https://github.com/ktsaou)) - Replication fixes 3 
([#14035](https://github.com/netdata/netdata/pull/14035), [@ktsaou](https://github.com/ktsaou)) - Improve performance of worker utilization statistics ([#14034](https://github.com/netdata/netdata/pull/14034), [@ktsaou](https://github.com/ktsaou)) - Use 2 levels of judy arrays to speed up replication on very busy parents ([#14031](https://github.com/netdata/netdata/pull/14031), [@ktsaou](https://github.com/ktsaou)) - Remove retries from SSL ([#14026](https://github.com/netdata/netdata/pull/14026), [@ktsaou](https://github.com/ktsaou)) - Silence misleading error on ACLK startup ([#14013](https://github.com/netdata/netdata/pull/14013), [@underhood](https://github.com/underhood)) - Change static image urls to app.netdata.cloud in alarm-notify.sh ([#14007](https://github.com/netdata/netdata/pull/14007), [@MrZammler](https://github.com/MrZammler)) - Fix MQTT-NG QoS0 ([#13997](https://github.com/netdata/netdata/pull/13997), [@underhood](https://github.com/underhood)) - Add 'funcs' capability ([#13992](https://github.com/netdata/netdata/pull/13992), [@underhood](https://github.com/underhood)) - Add debug info on left-over query targets ([#13990](https://github.com/netdata/netdata/pull/13990), [@ktsaou](https://github.com/ktsaou)) - Replication improvements ([#13989](https://github.com/netdata/netdata/pull/13989), [@ktsaou](https://github.com/ktsaou)) - Remove spaces from keys in processes output of apps.plugin functions ([#13980](https://github.com/netdata/netdata/pull/13980), [@ktsaou](https://github.com/ktsaou)) - Prohibit using spaces in apps.plugin function processes keys ([#13980](https://github.com/netdata/netdata/pull/13980), [@ktsaou](https://github.com/ktsaou)) - Fallback to ar and ranlib if llvm-ar and llvm-ranlib are not there ([#13959](https://github.com/netdata/netdata/pull/13959), [@MrZammler](https://github.com/MrZammler)) - Require -DENABLE_DLSYM=1 to use dlsym() ([#13958](https://github.com/netdata/netdata/pull/13958), [@ktsaou](https://github.com/ktsaou)) 
- Do not resend charts upstream when chart variables are being updated ([#13946](https://github.com/netdata/netdata/pull/13946), [@ktsaou](https://github.com/ktsaou)) - Update print message on startup ([#13934](https://github.com/netdata/netdata/pull/13934), [@andrewm4894](https://github.com/andrewm4894)) - Remove pluginsd action param & dead code ([#13928](https://github.com/netdata/netdata/pull/13928), [@vkalintiris](https://github.com/vkalintiris)) - Do not force internal collectors to call rrdset_next ([#13926](https://github.com/netdata/netdata/pull/13926), [@vkalintiris](https://github.com/vkalintiris)) - Return accidentaly removed 32bit RPi keep alive fix ([#13925](https://github.com/netdata/netdata/pull/13925), [@underhood](https://github.com/underhood)) - Add error_limit() function to limit number of error lines per instance ([#13924](https://github.com/netdata/netdata/pull/13924), [@ktsaou](https://github.com/ktsaou)) - Enable aclk conversation log even without NETDATA_INTERNAL CHECKS ([#13917](https://github.com/netdata/netdata/pull/13917), [@MrZammler](https://github.com/MrZammler)) - Add max value on all value columns for apps.plugin function ([#13899](https://github.com/netdata/netdata/pull/13899), [@ktsaou](https://github.com/ktsaou)) - Reduce unnecessary alert events to the cloud ([#13897](https://github.com/netdata/netdata/pull/13897), [@MrZammler](https://github.com/MrZammler)) - Tune rrdcontext timings ([#13889](https://github.com/netdata/netdata/pull/13889), [@ktsaou](https://github.com/ktsaou)) - Add filtering charts in context queries, includes them in full_xxx variables ([#13886](https://github.com/netdata/netdata/pull/13886), [@ktsaou](https://github.com/ktsaou)) - Cosmetic changes for apps.plugin function processes ([#13880](https://github.com/netdata/netdata/pull/13880), [@ktsaou](https://github.com/ktsaou)) - Allow single chart to be filtered in context queries ([#13879](https://github.com/netdata/netdata/pull/13879), 
[@ktsaou](https://github.com/ktsaou)) - Suppress ML and dlib ABI warnings ([#13875](https://github.com/netdata/netdata/pull/13875), [@Dim-P](https://github.com/Dim-P)) - Don't create a REMOVED alert event after a REMOVED ([#13871](https://github.com/netdata/netdata/pull/13871), [@MrZammler](https://github.com/MrZammler)) - Store hidden status when creating / updating dimension metadata ([#13869](https://github.com/netdata/netdata/pull/13869), [@stelfrag](https://github.com/stelfrag)) - Find the chart and dimension UUID from the context ([#13868](https://github.com/netdata/netdata/pull/13868), [@stelfrag](https://github.com/stelfrag)) - Change all api accesses to api.netdata.cloud ([#13856](https://github.com/netdata/netdata/pull/13856), [@underhood](https://github.com/underhood)) - Use mmap to read an extent from a datafile ([#13834](https://github.com/netdata/netdata/pull/13834), [@stelfrag](https://github.com/stelfrag)) - Remove option to use MQTT 3 ([#13824](https://github.com/netdata/netdata/pull/13824), [@underhood](https://github.com/underhood)) - Extended processes function info from apps.plugin ([#13822](https://github.com/netdata/netdata/pull/13822), [@ktsaou](https://github.com/ktsaou)) - Add trace alloc to buildinfo ([#13817](https://github.com/netdata/netdata/pull/13817), [@underhood](https://github.com/underhood)) - Inject costallocz to mqtt_websockets library and its children ([#13813](https://github.com/netdata/netdata/pull/13813), [@underhood](https://github.com/underhood)) - Overload libc memory allocators with custom ones to trace all allocations ([#13810](https://github.com/netdata/netdata/pull/13810), [@ktsaou](https://github.com/ktsaou)) - Fix warning when -Wfree-nonheap-object is used ([#13805](https://github.com/netdata/netdata/pull/13805), [@underhood](https://github.com/underhood)) - Optimize ARAL alloc size ([#13804](https://github.com/netdata/netdata/pull/13804), [@ktsaou](https://github.com/ktsaou)) - Add internal log error, when passing 
NULL dictionary ([#13803](https://github.com/netdata/netdata/pull/13803), [@ktsaou](https://github.com/ktsaou)) - Return memory freed properly ([#13799](https://github.com/netdata/netdata/pull/13799), [@stelfrag](https://github.com/stelfrag)) - Use string_freez instead of freez in rrdhost_init_timezone ([#13798](https://github.com/netdata/netdata/pull/13798), [@MrZammler](https://github.com/MrZammler)) - Add variants of functions allowing callers to specify the time to use. ([#13791](https://github.com/netdata/netdata/pull/13791), [@vkalintiris](https://github.com/vkalintiris)) - Remove extern from function declared in headers. ([#13790](https://github.com/netdata/netdata/pull/13790), [@vkalintiris](https://github.com/vkalintiris)) - Full memory tracking and profiling of Netdata Agent ([#13789](https://github.com/netdata/netdata/pull/13789), [@ktsaou](https://github.com/ktsaou)) - Add a thread to asynchronously process metadata updates ([#13783](https://github.com/netdata/netdata/pull/13783), [@stelfrag](https://github.com/stelfrag)) - Parser cleanup ([#13782](https://github.com/netdata/netdata/pull/13782), [@stelfrag](https://github.com/stelfrag)) - Bump websockets submodule ([#13776](https://github.com/netdata/netdata/pull/13776), [@underhood](https://github.com/underhood)) - Make dbengine free from RRDSET and RRDDIM ([#13772](https://github.com/netdata/netdata/pull/13772), [@ktsaou](https://github.com/ktsaou)) - Add possibility to build without ACLK with CMake ([#13736](https://github.com/netdata/netdata/pull/13736), [@underhood](https://github.com/underhood)) - Fix coverity warnings ([#13735](https://github.com/netdata/netdata/pull/13735), [@thiagoftsm](https://github.com/thiagoftsm)) - Do not create train/predict dimensions meant for tracking anomaly rates. 
([#13707](https://github.com/netdata/netdata/pull/13707), [@vkalintiris](https://github.com/vkalintiris)) - Update exporting unit tests ([#13706](https://github.com/netdata/netdata/pull/13706), [@vlvkobal](https://github.com/vlvkobal)) - Add new query engine for Netdata Agent (QUERY_TARGET) ([#13697](https://github.com/netdata/netdata/pull/13697), [@ktsaou](https://github.com/ktsaou)) - Use CMake generated config.h also in out of tree CMake build ([#13692](https://github.com/netdata/netdata/pull/13692), [@underhood](https://github.com/underhood)) - Store nulls instead of empty strings in health tables ([#13683](https://github.com/netdata/netdata/pull/13683), [@MrZammler](https://github.com/MrZammler)) - Fix warnings during compilation time on ARM (32 bits) ([#13681](https://github.com/netdata/netdata/pull/13681), [@thiagoftsm](https://github.com/thiagoftsm)) - Disable internal log ([#13678](https://github.com/netdata/netdata/pull/13678), [@ktsaou](https://github.com/ktsaou)) - Remove _instance_family label ([#13674](https://github.com/netdata/netdata/pull/13674), [@ilyam8](https://github.com/ilyam8)) - Additional sqlite statistics ([#13668](https://github.com/netdata/netdata/pull/13668), [@stelfrag](https://github.com/stelfrag)) - Add sqlite page cache hits and miss statistics ([#13665](https://github.com/netdata/netdata/pull/13665), [@stelfrag](https://github.com/stelfrag)) - Use mmap if possible during startup for journal replay ([#13660](https://github.com/netdata/netdata/pull/13660), [@stelfrag](https://github.com/stelfrag)) - Remove anomaly detector ([#13657](https://github.com/netdata/netdata/pull/13657), [@vkalintiris](https://github.com/vkalintiris)) - Do not free AR dimensions from within ML. 
([#13651](https://github.com/netdata/netdata/pull/13651), [@vkalintiris](https://github.com/vkalintiris)) - Remove Chart/Dim based communication ([#13650](https://github.com/netdata/netdata/pull/13650), [@underhood](https://github.com/underhood)) - Add RRD structures managed by dictionaries ([#13646](https://github.com/netdata/netdata/pull/13646), [@ktsaou](https://github.com/ktsaou)) - Fix compilation issues ([#13640](https://github.com/netdata/netdata/pull/13640), [@ktsaou](https://github.com/ktsaou)) - Obsolete RRDSET state ([#13635](https://github.com/netdata/netdata/pull/13635), [@ktsaou](https://github.com/ktsaou)) - Remove forgotten avl structure from rrdcalc ([#13632](https://github.com/netdata/netdata/pull/13632), [@ktsaou](https://github.com/ktsaou)) - Improve rrdcontext performance ([#13629](https://github.com/netdata/netdata/pull/13629), [@ktsaou](https://github.com/ktsaou)) - Clean chart hash map ([#13611](https://github.com/netdata/netdata/pull/13611), [@stelfrag](https://github.com/stelfrag)) - Use prepared statements for context related queries ([#13602](https://github.com/netdata/netdata/pull/13602), [@stelfrag](https://github.com/stelfrag)) - Add sqlite3 global statistics ([#13594](https://github.com/netdata/netdata/pull/13594), [@ktsaou](https://github.com/ktsaou)) - Various CMake improvements ([#13575](https://github.com/netdata/netdata/pull/13575), [@underhood](https://github.com/underhood)) - Deduplicate all netdata strings ([#13570](https://github.com/netdata/netdata/pull/13570), [@ktsaou](https://github.com/ktsaou)) - Removing logging that a chart collection in the same interpolation point ([#13567](https://github.com/netdata/netdata/pull/13567), [@ilyam8](https://github.com/ilyam8)) - Prefer context attributes from non archived charts ([#13559](https://github.com/netdata/netdata/pull/13559), [@MrZammler](https://github.com/MrZammler)) - Fix coverity 380387 ([#13551](https://github.com/netdata/netdata/pull/13551), 
[@MrZammler](https://github.com/MrZammler)) - Remove aclk_api.ch ([#13540](https://github.com/netdata/netdata/pull/13540), [@underhood](https://github.com/underhood)) - Cleanup of APIs ([#13539](https://github.com/netdata/netdata/pull/13539), [@underhood](https://github.com/underhood)) - Schedule next rotation based on absolute time ([#13531](https://github.com/netdata/netdata/pull/13531), [@MrZammler](https://github.com/MrZammler)) - Calculate name hash after rrdvar_fix_name ([#13509](https://github.com/netdata/netdata/pull/13509), [@MrZammler](https://github.com/MrZammler)) - Reduce memcpy and memory usage on mqtt5 ([#13450](https://github.com/netdata/netdata/pull/13450), [@underhood](https://github.com/underhood)) - Specify paths to source files for out-of-tree build ([#11475](https://github.com/netdata/netdata/pull/11475), [@KickerTom](https://github.com/KickerTom)) </details> ## Deprecation and product notices <a id="v1370-deprecation"></a> ### Forthcoming deprecation notice The following items will be removed in our next minor release (v1.38.0): > Patch releases (if any) will not be affected. 

| Component | Type | Will be replaced by |
|-----------|:----:|:-------------------:|
| [python.d/dockerd](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/dockerd) | collector | [go.d/docker](https://github.com/netdata/go.d.plugin/tree/master/modules/docker) |
| [python.d/logind](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/logind) | collector | [go.d/logind](https://github.com/netdata/go.d.plugin/tree/master/modules/logind) |
| [python.d/mongodb](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/mongodb) | collector | [go.d/mongodb](https://github.com/netdata/go.d.plugin/tree/master/modules/mongodb) |
| [fping](https://github.com/netdata/netdata/tree/master/collectors/fping.plugin) | collector | [go.d/ping](https://github.com/netdata/go.d.plugin/tree/master/modules/ping) |

All the deprecated components will be moved to the [netdata/community](https://github.com/netdata/community) repository.
### Deprecated in this release

In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.36.0#v1360-deprecation-notice), the following items have been removed in this release:

| Component | Type | Replaced by |
|-----------|:----:|:-----------:|
| [python.d/postgres](https://github.com/netdata/netdata/tree/v1.36.0/collectors/python.d.plugin/postgres) | collector | [go.d/postgres](https://github.com/netdata/go.d.plugin/tree/master/modules/postgres) |

### Notable changes and suggested actions

#### Kickstart unrecognized option error

As part of our ongoing effort to improve the kickstart script, documented [here](https://github.com/netdata/netdata/issues/12630) and [here](https://github.com/netdata/netdata/pull/12943), the next major release will make the script return an error when it is passed an unrecognized option, rather than letting the option pass through to the installer code.

#### New documentation structure

In the coming weeks, we will be introducing a new structure to [Netdata Learn](https://learn.netdata.cloud). Part of this effort includes having healthy redirects, instructions, and landing pages to minimize confusion and lost bookmarks, but users may still encounter broken links or errors when loading moved or deleted pages. If you encounter such a problem, feel free to submit a [GitHub issue](https://github.com/netdata/netdata/issues), or reach out to the [Netdata Documentation Team](mailto:Documentation@netdata.cloud) with questions or ideas on how our docs can best serve you.
#### External plugin packaging (Possible action required)

In a forthcoming release, many external plugins will be moved to their own packages in our native packages to allow enhanced control over what plugins you have installed, to preserve bandwidth when updating, and to avoid some potentially undesirable dependencies. As a result of this, at some point during the lead-up to the next minor release, the following plugins will no longer be installed by default on systems using native packages, and users with any of these plugins on an existing install will need to manually install the packages in order to continue using them:

- nfacct
- ioping
- slabinfo
- perf
- charts.d

**Note**: Static builds and locally built installations are unaffected. Netdata will provide more details once the changes go live.

## Netdata Release Meetup <a id="v1370-release-meetup"></a>

Join the Netdata team on the 1st of December, at 5PM UTC, for the **Netdata Release Meetup**, which will be held on the [Netdata Discord](https://discord.gg/CTju7msd?event=1047466036389216256).

Together we’ll cover:

- Release Highlights
- Acknowledgements
- Q&A with the community

[RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/290057165/) - we look forward to meeting you.

## Support options <a id="v1370-support-options"></a>

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in Netdata, feel free to contact us through one of the following channels:

[Netdata Learn](https://learn.netdata.cloud/): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.

[GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request.
[GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it.

[Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base.

[Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1300 engineers are already using it!

2022-11-30T18:12:17+00:00 netdata v1.37.1 netdata v1.37.1 2022-12-05T16:07:08+00:00

Netdata v1.37.1 is a patch release to address issues discovered since v1.37.0. Refer to the [v1.37.0 release notes](https://github.com/netdata/netdata/releases/tag/v1.37.0) for the full scope of that release.

The v1.37.1 patch release fixes the following issues:

- Parent agent crash when many child instances (re)connect at the same time, causing simultaneous SSL re-initialization ([PR #14076](https://github.com/netdata/netdata/pull/14076)).
- Agent crash during dbengine database file rotation when a page is read while it is being deleted ([PR #14081](https://github.com/netdata/netdata/pull/14081)).
- Agent crash on metrics page alignment when metrics stopped being collected for a long time and then started again ([PR #14086](https://github.com/netdata/netdata/pull/14086)).
- Broken Fedora native packages ([PR #14082](https://github.com/netdata/netdata/pull/14082)).
- Incorrect dbengine backfilling statistics ([PR #14074](https://github.com/netdata/netdata/pull/14074)).

In addition, the release contains the following optimizations and improvements:

- Improve workers statistics performance ([PR #14077](https://github.com/netdata/netdata/pull/14077)).
- Improve replication performance ([PR #14079](https://github.com/netdata/netdata/pull/14079)).
- Optimize dictionaries ([PR #14085](https://github.com/netdata/netdata/pull/14085)).
## Support options

As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels:

- [Netdata Learn](https://learn.netdata.cloud/): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata.
- [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request.
- [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it.
- [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base.
- [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1300 engineers are already using it!

2022-12-05T16:07:08+00:00 netdata v1.38.0 netdata v1.38.0 2023-02-06T16:05:20+00:00

- [Release Highlights](#v1380-release-highlights)
    - **[DBENGINE v2](#v1380-dbenginev2)**<br/> The new open-source database engine for Netdata Agents, offering huge performance, scalability and stability improvements, with a fraction of the memory footprint!
    - **[FUNCTION: Processes](#v1380-functions)**<br/> Netdata beyond metrics! We added support for **runtime functions**, which any data collection plugin can implement, to offer unlimited visibility into anything, even non-metrics, that can be valuable while troubleshooting.
    - **[Events Feed](#v1380-feed)**<br/> Centralized view of Space and Infrastructure level events about topology changes and alerts.
    - **[NOTIFICATIONS: Slack, PagerDuty, Discord, Webhooks](#v1380-notifications)**<br/> Netdata Cloud now supports **Slack**, **PagerDuty**, **Discord**, **Webhooks**.
    - **[Role-based access model](#v1380-rbac)**<br/> Netdata Cloud supports more roles, offering finer control over access to infrastructure.
- [Integrations](#v1380-integrations)<br/> New and improved plugins for data collection, alert notifications, and data exporters.
    - [Collectors](#v1380-collectors)
    - [Notifications](#v1380-notifications)
    - [Exporters](#v1380-exporters)
- [Health Monitoring and Alerts Notification Engine](#v1380-health)<br/> Changes to the Netdata Health Monitoring and Notifications engine.
- [Visualizations / Charts and Dashboards](#v1380-visualization)
- [Database](#v1380-database)
- [Streaming and Replication](#v1380-streaming)
- [API](#v1380-api)
- [Machine Learning](#v1380-ml)
- [Installation and Packaging](#v1380-packaging)
- [Documentation and Demos](#v1380-documentation)
- [Administration](#v1380-administration)
- [Other Notable Changes](#v1380-other)
- [Deprecation notice](#v1380-deprecation)
- [Netdata Agent release meetup](#v1380-release-meetup)
- [Support options](#v1380-support-options)
- [Acknowledgements](#v1380-ack)

> ❗We are keeping our codebase healthy by removing features that are end-of-life. Read the [deprecation notice](#v1380-deprecation) to check if you are affected.
### Netdata open-source growth

- Almost 62,000 GitHub Stars
- Over four million monitored servers
- Almost 88 million sessions served
- Over 600 thousand total nodes in Netdata Cloud

## Release highlights <a id="v1380-release-highlights"></a>

### Dramatic performance and stability improvements, with a smaller agent footprint <a id="v1380-dbenginev2"></a>

We completely reworked our custom-made time series database (dbengine), resulting in stunning improvements to performance, scalability, and stability, while at the same time significantly reducing the [agent memory requirements](https://github.com/netdata/netdata/tree/master/database/engine#memory-requirements).

On production-grade hardware (e.g. 48 threads, 32GB RAM), Netdata Agent Parents can easily collect 2 million points/second while servicing data queries for 10 million points/second, and running ML training and Health querying 1 million points/second each!

For standalone installations, the 64bit version of Netdata runs stable at about 150MB RAM (Resident Set Size + SHARED), with everything enabled (the 32bit version at about 80MB RAM, again with everything enabled).

![image](https://user-images.githubusercontent.com/2662304/212779720-e448f6cd-a60a-48d0-930d-f2569bcd6fc6.png)

<details> <summary>Read more about the changes over dbengine v1</summary>

#### Key highlights of the new dbengine

##### Disk based indexing

We introduced a new journal file format (`*.njfv2`) that is much faster to initialize during loading. This file is used as a disk-based index for all metric data available on disk (metrics retention), reducing the memory requirements of dbengine by about 80%.

##### New caching

Three new **caches** (main cache, open journal cache, extent cache) have been added to speed up queries and control the memory footprint of dbengine. Combined, these caches offer excellent caching even for the most demanding queries.
Cache hit ratio now rarely falls below 50%, while for the most common use cases, it is constantly above 90%.

The 3 caches support **memory ballooning** and autoconfigure themselves, so they don't require any user configuration in `netdata.conf`.

At the same time, their memory footprint is **predictable**: twice the memory of the currently collected metrics, across all tiers. The exact equation is:

```
METRICS x 4KB x (TIERS - 1) x 2 + 32MB
```

Where:

- `METRICS x 4KB x TIERS` is the size of the concurrently collected metrics.
- `4KB` is the page size for each metric.
- `TIERS` is the value configured for `[db].storage tiers` in `netdata.conf`; use `(TIERS - 1)` when using 3 tiers or more (3 is the default).
- `x 2 + 32MB` is the commitment of the new dbengine.

The new combination of caches makes the Netdata memory footprint **independent of retention**! The amount of metric data on disk no longer affects the memory footprint of Netdata, whether it is just a few MB or hundreds of GB!

The caches try to keep the memory footprint at 97% of the predefined size (i.e. twice the concurrently collected metrics size). They automatically enter a survival mode when memory goes above this, by parallelizing LRU evictions and metric data flushing (saving to disk). This system has 3 distinct levels of operation:

- **aggressive evictions**, when caches are above 99% full; in this mode cache query threads are turned into page evictors, trying to remove the least used data from the caches.
- **critical evictions**, when caches are above 101% full; in this mode every thread that accesses the cache is turned into a batch evictor, not leaving the cache until the cache size is again within acceptable limits.
- **flushing critical**, when too much unsaved data resides in memory; in this mode, flushing is parallelized, trying to push data to disk as soon as possible.

The caches are now shared across all dbengine instances (all tiers).
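
To get a feel for the footprint equation given earlier in this section, here is a minimal Python sketch that evaluates it. It is an illustration only, not Netdata code: the function name is hypothetical, it assumes binary units (4KB = 4096 bytes, 32MB = 32 x 1024 x 1024 bytes), and it reads the "use `(TIERS - 1)` when using 3 tiers or more" note as meaning that `TIERS` is used unchanged for 1 or 2 tiers.

```python
KIB = 1024
MIB = 1024 * KIB

PAGE_SIZE = 4 * KIB         # "4KB is the page size for each metric"
FIXED_OVERHEAD = 32 * MIB   # the "+ 32MB" term of the equation


def dbengine_cache_footprint(metrics: int, tiers: int = 3) -> int:
    """Estimate the dbengine cache footprint in bytes (hypothetical helper).

    Implements: METRICS x 4KB x (TIERS - 1) x 2 + 32MB
    Assumption: with fewer than 3 tiers, TIERS is used as-is.
    """
    if metrics < 0 or tiers < 1:
        raise ValueError("metrics must be >= 0 and tiers must be >= 1")
    effective_tiers = tiers - 1 if tiers >= 3 else tiers
    return metrics * PAGE_SIZE * effective_tiers * 2 + FIXED_OVERHEAD


if __name__ == "__main__":
    for metrics in (2_000, 100_000, 1_000_000):
        size = dbengine_cache_footprint(metrics)  # default of 3 tiers
        print(f"{metrics:>9,} metrics, 3 tiers: {size / MIB:,.0f} MiB")
```

Because the result depends only on the number of concurrently collected metrics and the number of tiers, the same estimate holds regardless of how much retention is kept on disk.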
LRU evictions are now smarter: the caches know when metrics are referenced by queries or by collectors and favor the ones that have been used recently by data queries.

##### New dbengine query engine

The new dbengine query engine is totally asynchronous, working in parallel while other threads are processing metrics points. Chart and Context queries, but also Replication queries, now take advantage of this feature and ask dbengine to preload metric data in advance, before they are actually needed. This makes Netdata amazingly fast to respond to data queries, even on busy parents that are collecting millions of points at the same time.

At the same time we support prioritization of queries based on their nature:

- **High priority queries** are all those that can potentially block data collection. Such queries are **tiers backfilling** and the last replication query for each metric (immediately after which, streaming is enabled).
- **Normal priority queries** are the ones that are initiated by users.
- **Low priority queries** are the ones that can be delayed without affecting the quality of the results, like Health and Replication queries.
- **Best effort queries** are the lowest priority ones and are currently used by ML training queries.

Starvation is prevented by allowing 2% of lower priority queries for each higher priority queue. So, even when backfilling is performed at full speed at 15 million points per second, user queries are satisfied up to 300k points per second.

Internally all caches are partitioned to allow parallelism up to the number of cores the system has available. On busy parents with a lot of data and capable hardware it is now easy for Netdata to respond to queries using 10 million points per second.

At the same time, **extent deduplication** has been added, to prevent the unnecessary loading and decompression of an extent multiple times in a short time.
This works as follows: while a request to load an extent is in flight, and up to the time the actual extent has been loaded and uncompressed in memory, more requests to extract data from it can be added to the same in-flight request! Since dbengine tries to keep metrics of the same charts in the same extent, combined with the feature we added to prepare multiple queries ahead, this extent deduplication now provides a hit ratio above 50% for normal chart and context queries!

##### Metrics registry

A new **metrics registry** has been added that maintains an index of all metrics in the database, for all tiers combined. Initialization of the metrics registry is fully multithreaded, utilizing all the resources available on busy parents, improving start-up times significantly.

This metrics registry is now the only memory requirement related to retention. It keeps in memory the first and the last timestamps, along with a few more metadata, of all the metrics for which retention is available on disk. The metrics registry needs about 150 bytes per metric.

#### Streaming <a id="v1380-stream"></a>

The biggest change in streaming is that the parent agents now inherit the clock of their children, for their data. So, all timestamps about collected metrics reflect the timestamps on the children that collected them. If a child clock is ahead of the parent clock, the parent will still accept collected points for the child, and it will process them and save them, but on parent node restart the parent will refuse to load future data about a child database. This has been done in such a way that if the clock of the child is fixed (by adjusting it backwards), after a parent restart the child will be able to push fresh metrics to the parent again.

Related to the memory footprint of the agent, streaming buffers were ballooning up to the configured size and remained like that for the lifetime of the agent.
Now the streaming buffers are increased to satisfy the demand, but then they are again decreased to a minimum size. On busy parents this has a significant impact on the overall memory footprint of the agent (10MB buffer per node x 200 child nodes on this parent, is 2GB - now they return to a few KB per node).

Active-Active parent clusters are now more reliable by detecting stale child connections and disconnecting them.

Several child to parent connection issues have been solved.

#### Replication <a id="v1380-repl"></a>

Replication now uses the new features of dbengine and pipelines queries preparation and metric data loading, drastically improving its performance. At the same time, the replication step is now automatically adjusted to the page size of dbengine, allowing replication to use the data already loaded by dbengine and saving resources at the next iteration. A single replication thread can now push metrics at a rate of above 1 million points / second on capable hardware.

Solved an issue with replication, where if the replicated time-frame had a gap at the beginning of the replicated period, then no replication was performed for that chart. Now replication skips the gap and continues replicating all the points available.

Replication no longer replicates empty points. The new dbengine has controls in place to insert gaps into the database where metrics are missing. Utilizing this feature, we have now stopped replicating empty points, saving bandwidth and processing time.

Replication was increasing the streaming buffers above the configured ones, when big replication messages had to fit in it. Now, instead of increasing the streaming buffers, we interrupt the replication query at a point that the buffer will be sufficient to accept the message. When queries are interrupted like this, the remaining query is then repeated until all of it has been executed.
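
The interrupt-and-repeat behaviour described above can be pictured with a small Python sketch. This is a simplified model, not Netdata's implementation: `replicate_in_batches`, `size_of` and the byte-based `buffer_capacity` are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def replicate_in_batches(points: Iterable[T], buffer_capacity: int,
                         size_of: Callable[[T], int]) -> Iterator[List[T]]:
    """Yield batches whose total encoded size fits in buffer_capacity."""
    batch: List[T] = []
    used = 0
    for point in points:
        needed = size_of(point)
        # Interrupt the current "query" before the buffer would overflow,
        # then continue with the remaining points in a new batch.
        if batch and used + needed > buffer_capacity:
            yield batch
            batch, used = [], 0
        batch.append(point)
        used += needed
    # Send whatever is left once the input is exhausted.
    if batch:
        yield batch
```

In this sketch a single point larger than the buffer is still sent on its own; how the agent treats such a case is not covered by these notes.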
Replication and data collection are now synchronized atomically at the sending side, to ensure that the parent will not have gaps at the point the replication ends and streaming starts. Replication had discrepancies when the db mode was not `dbengine`. To solve these discrepancies, combined with the storage layer API changes introduced by the new dbengine, we had to rewrite them to be compliant. Replication can now function properly, without gaps at the parents, even when the child has db mode `alloc`, `ram`, `save` or `map`. #### Netdata startup and shutdown Several improvements have been performed to speed up agent startup and shutdown. Combined with the new dbengine, now Netdata starts instantly on single node installations and uses just a fraction of the time that was needed by the previous stable version, even on very busy parents with huge databases (hundreds of GB). Special care has been taken to ensure that during shutdown the agent prioritizes dbengine flushing to disk of any unsaved data. So, now during shutdown, data collection is first stopped and then the hot and dirty pages of the main cache are flushed to disk before proceeding with other cleanup activities. </details> ### Functions <a id="v1380-functions"></a> After the groundwork done on the Netdata Agent in v1.37.0, Netdata Agent collectors are able to expose functions that can be executed on-demand, at run-time, by the data collecting agent, even when queries are executed via a Netdata Agent Parent. We are now utilizing this capability to provide the first of many powerful features via the Netdata Cloud UI. Netdata Functions on Netdata Cloud allow you to trigger specific routines to be executed by a given Agent on request. 
These routines can range from a simple reader that fetches real time information to help you troubleshoot (like the list of currently running processes, currently running db queries, currently open connections, etc.), to routines that trigger an action on your behalf (restart a service, rotate logs, etc.), directly on the node. The key point is to remove the need to open an ssh connection to your node to execute a command like `top` while you are troubleshooting.

The routines are triggered directly from the Netdata Cloud UI, with the request going through the secure [Agent-Cloud Link (ACLK)](https://learn.netdata.cloud/docs/agent/aclk) that the agent has already established. Moreover, unlike many of the commands you'd issue from the shell, Netdata Functions come with powerful capabilities like auto-refresh, sorting, filtering, search and more! And, like everything about Netdata, they are fast!

#### What functions are currently available?

At the moment, just one, to display detailed information on the currently running processes on the node, replacing `top` and `iotop`. The function is provided by the [apps.plugin](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md) collector.

![chrome_tABzCnU6BP](https://user-images.githubusercontent.com/82235632/215847598-ebfa80f9-58e8-4538-ba36-daddd7e3ea58.gif)

<details>
<summary>Read more about the Netdata Functions</summary>

#### How do functions work?

The nitty-gritty details are in PR "Allow netdata plugins to expose functions for querying more information about specific charts" ([#13720](https://github.com/netdata/netdata/pull/13720)). In short:

- Each plugin can register with the main Netdata agent process a set of functions that it supports for the host it runs on (global functions), or for a given chart (chart local functions), along with the acceptable parameters and parameter values for each one. The plugin also defines the format of the response it will provide, if a certain function is called.
- The agent makes the information available via its API, but also returns the available functions for a chart in the response of every `data` query call, that returns the metric values.
- To execute a registered function, one needs to call the `/api/v1/functions` endpoint ([see it in swagger](https://editor.swagger.io/?url=https://raw.githubusercontent.com/netdata/netdata/master/web/api/netdata-swagger.yaml)). However, for security reasons, the specific call is protected, meaning it is disabled from the HTTP API and will return a 403. Only the cloud can call the particular endpoint and only via the secure and protected Agent-Cloud Link (ACLK).
- When the endpoint is called, the agent daemon invokes the requested function on the collector via [a new plugins.d API endpoint](https://github.com/netdata/netdata/tree/master/collectors/plugins.d#function). Note that the `plugins.d` API has for the first time become bidirectional, precisely to support the daemon querying this type of information.

#### How do functions work with streaming?

The definitions of functions are transmitted to parent nodes via streaming, so that the parents know all the functions available on all the child databases they maintain. This works even across multiple levels of parents.

When a parent node is connected to Netdata Cloud, it is capable of triggering the call to the respective child node, to run any of its functions. When multiple parents are involved, all of them will propagate the request to the right child to execute the function.

#### Why are they available only on Netdata Cloud?

Since these functions are able to execute routines on the node and expose information beyond metric data (even action buttons could be implemented using functions), our concern is to ensure no sensitive information or disruptive actions are exposed through the unprotected Agent's API.
Since Netdata Cloud provides all the infrastructure to authenticate users, assign roles to them and establishes a secure communication channel to Netdata Agents [ACLK](https://github.com/netdata/netdata/blob/master/aclk/README.md), this concern is addressed. Netdata Cloud is free forever for everyone, providing a lot more than just the agent dashboard and is our main focus of development for new visualization features. #### Next steps For even more details please check [our docs](https://learn.netdata.cloud/docs/nightly/concepts/netdata-functions). If you have ideas or requests for other functions: * Participate in the relevant [GitHub Discussion](https://github.com/netdata/netdata/discussions/14412) * Open a [feature request](https://github.com/netdata/netdata-cloud/issues/new?assignees=&labels=feature+request%2Cneeds+triage&template=FEAT_REQUEST.yml&title=%5BFeat%5D%3A+) on the Netdata Cloud repo * Engage with our community on the [Netdata discord server](https://discord.com/invite/mPZ6WZKKG2). </details> ### Events feed<a id="v1380-feed"></a> *Coming by Feb 15th* The **Events feed** is a powerful new feature that tracks events that happen on your infrastructure, or in your Space. The feed lets you investigate events that occurred in the past, which is obviously invaluable for troubleshooting. Common use cases are ones like when a node goes offline, and you want to understand what events happened before that. A detailed event history can also assist in attributing sudden pattern changes in a time series to specific changes in your environment. We start from humble beginnings, capturing [topology events](#v1380-topology-events) (node state transitions) and [alert state transitions](#v1380-alert-events). We intend to expand the events we capture to include infrastructure changes like deployments or services starting/stopping and we plan to provide a way to display the events in the standard Netdata charts. #### What are the available events? 
> ⚠️ Based on your space's plan different allowances are defined to query past data. The length of the history is provided in this table: | **Domains of events** | **Community** | **Pro** | **Business** | | :-- | :-- | :-- | :-- | | **[Topology events](#v1380-topology-events)** <p>Node state transition events, e.g. live or offline.</p>| 4 hours | 7 days | 14 days | | **[Alert events](#v1380-alert-events)** <p>Alert state transition events, can be seen as an alert history log.</p>| 4 hours | 7 days | 90 days | ##### Topology events <a id="v1380-topology-events"></a> | **Event name** | **Description** | | :-- | :-- | | Node Became Live | The node is collecting and streaming metrics to Cloud.| | Node Became Stale | The node is offline and not streaming metrics to Cloud. It can show historical data from a parent node. | | Node Became Offline | The node is offline, not streaming metrics to Cloud and not available in any parent node.| | Node Created | The node is created, but it is still Unseen on Cloud, didn't establish a successful connection yet.| | Node Removed | The node was removed from the Space through the `Delete` action, if it becomes Live again it will be automatically added. | | Node Restored | The node is restored, if node becomes Live after a remove action it is re-added to the Space. | | Node Deleted | The node is deleted from the Space, see this as an hard delete and won't be re-added to the Space if it becomes live. | | Agent Claimed | The agent was successfully registered to Netdata Cloud and is able to connect. | | Agent Connected | The agent connected to the Netdata Cloud MQTT server (Agent-Cloud Link established). | | Agent Disconnected | The agent disconnected from the Netdata Cloud MQTT server (Agent-Cloud Link severed). | | Agent Authenticated | The agent successfully authenticated itself to Netdata Cloud. | | Agent Authentication Failed | The agent failed to authenticate itself to Netdata Cloud. 
| ##### Alert events <a id="v1380-alert-events"></a> | **Event name** | **Description** | | :-- | :-- | | Node Alert State Changed | These are node alert state transition events and can be seen as an alert history log. You can see transitions to or from any of these states: Cleared, Warning, Critical, Removed, Error or Unknown | ### Additional alert notification methods on Netdata Cloud <a id="v1380-notifications"></a> *Coming by Feb 15th* Every Netdata Agent comes with hundreds of pre-installed health alerts designed to notify you when an anomaly or performance issue affects your node or the applications it runs. All these events, from all your nodes, are centralized at Netdata Cloud. Before this release, Netdata Cloud was only dispatching centralized email alert notifications to your team whenever an alert enters a warning, critical, or unreachable state. However, the agent supported tens of notification delivery methods, which we hadn't provided via the cloud. We are now adding to Netdata Cloud more alert notification integration methods. We categorize them similarly to our [subscription plans](#v1380-paidplans), as Community, Pro and Business. On this release, we added **Discord** (Community Plan), **web hook** (Pro Plan), **PagerDuty** and **Slack** (Business Plan). ![chrome_2M3bGJxVTS](https://user-images.githubusercontent.com/82235632/215892088-cc82043b-b2c5-47b8-9a76-99a6f3cc8950.gif) <details> <summary>More details on notifications</summary> #### Notification method availability > ⚠️ Netdata Cloud notification methods availability depends on your [subscription plan](#v1380-paidplans). 
| **Notification methods** | **Community** | **Pro** | **Business** |
| :-- | :--: | :--: | :--: |
| Email | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Discord | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Web hook | - | :heavy_check_mark: | :heavy_check_mark: |
| PagerDuty | - | - | :heavy_check_mark: |
| Slack | - | - | :heavy_check_mark: |

#### Notification method types

Notification integrations are classified based on whether they need to be configured per user (**Personal** notifications), or at the system level (**System** notifications).

Email notifications are **Personal**, meaning that administrators can enable or disable them globally, and each user can enable or disable them for themselves, per room. Email notifications are sent to the destination of the channel which is a user-specific attribute, e.g. user's e-mail. The users are the ones who can manage what specific configurations they want for the Space / Room(s) and the desired Notification level, via their User Profile page under **Notifications**.

All other introduced methods are classified as **System**, as the destination is a target that usually isn't specific to a single user, e.g. slack channel. These notification methods allow for fine-grained rule settings to be done by administrators. Administrators are able to specify different targets depending on Rooms or Notification level settings.

For more details please check the documentation [here](https://learn.netdata.cloud/docs/nightly/operations/alerts/alert-notifications).

</details>

### Improved role-based access model <a id="v1380-rbac"></a>

*Coming by Feb 15th*

Netdata Cloud already provides a role-based-access mechanism, that allows you to control what functionalities in the app users can access. Each user can be assigned only one role, which fully specifies all the capabilities they are afforded.
With the advent of the [paid plans](#v1380-paidplans) we revamped the roles to cover needs expressed by our users, like providing more limited access to your customers, or being able to join any room. We also aligned the offered roles to the target audience of each plan. The end result is the following: | **Role** | **Community** | **Pro** | **Business** | | :-- | :--: | :--: | :--: | | **Administrators** <p>This role allows users to manage Spaces, War Rooms, Nodes, Users, and Plan & Billing settings.</p><p>Provides access to all War Rooms in the space</p> | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | **Managers** <p>This role allows users to manage War Rooms and Users. </p><p>Provides access to all War Rooms and Nodes in the space.</p> | - | - | :heavy_check_mark: | | **Troubleshooters** <p>This role is for users focused on using Netdata to troubleshoot, not manage entities.</p><p>Provides access to all War Rooms and Nodes in the space.</p> | - | :heavy_check_mark: | :heavy_check_mark: | | **Observers** <p>This role is for read-only access, with restricted access to explicitly defined War Rooms and only the Nodes that appear in those War Rooms.</p><p> 💡 Ideal for restricting your customer's access to their own dedicated rooms.</p> | - | - | :heavy_check_mark: | | **Billing** <p>This role is for users that only need to manage billing options and see invoices, with no further access to the system.</p> | - | - | :heavy_check_mark: | ![image](https://user-images.githubusercontent.com/82235632/215895022-97e8f93d-8c47-4c22-bfae-d25f9bd80a40.png) ## Integrations <a id="v1380-integrations"></a> ### Collectors <a id="v1380-collectors"></a> #### Proc The [proc plugin](https://learn.netdata.cloud/docs/collect/system-metrics) gathers metrics from the `/proc` and `/sys` folders in Linux systems, along with a few other endpoints, and is responsible for the bulk of the system metrics collected and visualized by Netdata. 
It collects CPU, memory, disks, load, networking, mount points, and more.

We added a "cpu" label to the per core utilization % charts. Previously, the only way to filter or group by core was to use the "instance", i.e. the chart name. The new label makes the displayed dimensions much more user-friendly.

We [fixed](https://github.com/netdata/netdata/pull/14255) the issues we had with collection of CPU/memory metrics when running inside an LXC container as a `systemd` service.

We also [fixed](https://github.com/netdata/netdata/pull/14252) the missing network stack metrics, when IPv6 is disabled.

Finally, we improved how the `loadavg` alerts behave when the number of processors [is 0](https://github.com/netdata/netdata/pull/14286), or [unknown](https://github.com/netdata/netdata/pull/14265).

#### Apps

The [apps plugin](https://learn.netdata.cloud/docs/collect/application-metrics) breaks down system resource usage to processes, users and user groups, by reading the whole process tree and collecting resource usage information for every process found running.

We [fixed](https://github.com/netdata/netdata/pull/14156) the `nodejs` application group `node`, which incorrectly included `node_exporter`. The rule now is that the process must be called `node` to be included in that group.

We also [added a telegraf application group](https://github.com/netdata/netdata/pull/14188).

#### Containers and VMs (CGROUPS)

The [cgroups plugin](https://learn.netdata.cloud/docs/agent/collectors/cgroups.plugin) reads information on Linux Control Groups to monitor containers, virtual machines and systemd services.

The "net" section in a `cgroups` container would occasionally pick the wrong / random interface name to display in the navigation menu. We [removed the interface name](https://github.com/netdata/netdata/pull/14174) from the `cgroup` "net" family. The information is available in the cloud as labels and on the agent as chart names and ids.
#### eBPF The [eBPF plugin](https://learn.netdata.cloud/docs/agent/collectors/ebpf.plugin) helps you troubleshoot and debug how applications interact with the Linux kernel. We [improved](https://github.com/netdata/netdata/pull/14270) the speed and resource impact of the collector shutdown, by reducing the number of threads running in parallel. We fixed a bug with eBPF routines that would sometimes cause kernel panic and system reboot on RedHat 8.* family OSs. [#14090](https://github.com/netdata/netdata/pull/14090), [#14131](https://github.com/netdata/netdata/pull/14131) We [fixed](https://github.com/netdata/netdata/pull/14131) an `ebpf.d` crash: `sysmalloc` Assertion failed, then killed with `SIGTERM`. We [fixed](https://github.com/netdata/netdata/pull/14131) a crash when building eBPF while using a memory address sanitizer. The eBPF collector also creates charts for each running application through an integration with the `apps.plugin`. This integration helps you understand how specific applications interact with the Linux kernel. In systems with many VMs (like Proxmox), this integration can cause a large load. We used to have the integration turned on by default, with the ability to disable it from `ebpf.d.conf`. We have now done the opposite, having the integration disabled by default, with the ability to enable it. [#14147](https://github.com/netdata/netdata/pull/14147) #### Windows Monitoring We have been making tremendous improvements on how we [monitor Windows Hosts](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/wmi). The work will be completed in the next release. 
For now, we can say that we have done some preparatory work by [adding more info to existing charts](https://github.com/netdata/netdata/pull/14001), adding metrics for [MS SQL Server](https://github.com/netdata/go.d.plugin/pull/1041), [IIS](https://github.com/netdata/go.d.plugin/pull/972) in 1.37, [Active Directory](https://github.com/netdata/go.d.plugin/pull/1003), [ADFS](https://github.com/netdata/go.d.plugin/pull/1013) and [ADCS](https://github.com/netdata/go.d.plugin/pull/1007). We also [reorganized the navigation menu](https://github.com/netdata/go.d.plugin/pull/1065), so that Windows application metrics don't appear under the generic "WMI" category, but on their own category, just like Linux applications. We invite you to try out with these collectors either from a remote Linux machine, or using our new [MSI installer](https://github.com/netdata/msi-installer), which however is not suitable for production. Your feedback will be really appreciated, as we invest on making Windows Monitoring a first class citizen of Netdata. #### Generic Prometheus Endpoint Monitoring Our [Generic Prometheus Collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus) gathers metrics from any [Prometheus](https://prometheus.io/) endpoint that uses the [OpenMetrics exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/). To allow better grouping and filtering of the collected metrics we now [create a chart with labels per label set](https://github.com/netdata/go.d.plugin/pull/1004). We also [fixed the handling of Summary/Histogram NaN values](https://github.com/netdata/go.d.plugin/pull/1027). #### TCP endpoint monitoring The [TCP endpoint (portcheck) collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/portcheck) monitors TCP service availability and response time. 
We [enriched](https://github.com/netdata/netdata/pull/14137) the `portcheck` alarms with labels that show the problematic host and port.

#### HTTP endpoint monitoring

The [HTTP endpoint monitoring collector (httpcheck)](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/httpcheck) monitors the availability and response time of HTTP endpoints.

We [enriched the alerts](https://github.com/netdata/netdata/pull/14133) with labels that show the slow or unavailable URL relevant to the alert.

#### Host reachability (ping)

The new [host reachability collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/ping) replaced `fping` in v1.37.0. We [removed](https://github.com/netdata/netdata/pull/14073) the deprecated `fping.plugin`, in accordance with the v1.37.0 deprecation notice.

#### RabbitMQ

The [RabbitMQ collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/rabbitmq) monitors the open source message broker, by querying its `overview`, `node` and `vhosts` HTTP endpoints.

We [added monitoring of the RabbitMQ queues](https://github.com/netdata/go.d.plugin/pull/1047) that was available in the older Python module and [fixed an issue](https://github.com/netdata/go.d.plugin/pull/1052) with the new metrics.

#### MongoDB

We monitor the [MongoDB](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mongodb) NoSQL database [serverStatus](https://www.mongodb.com/docs/manual/reference/command/serverStatus/#mongodb-dbcommand-dbcmd.serverStatus) and [dbStats](https://www.mongodb.com/docs/manual/reference/command/dbStats/#dbstats).

To allow better grouping and filtering of the collected metrics we now [create a chart per database, repl set member, shard and additional metrics](https://github.com/netdata/go.d.plugin/pull/1042). We also [improved](https://github.com/netdata/go.d.plugin/pull/1046) the `cursors_by_lifespan_count` chart dimension names, to make them clearer.
#### PostgreSQL Our powerful [PostgreSQL database collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/postgressql) has been enhanced with an improved [WAL replication lag calculation](https://github.com/netdata/go.d.plugin/pull/1039) and [better support of versions before 10](https://github.com/netdata/go.d.plugin/pull/1018). #### Redis The [Redis collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis) monitors the in-memory data structure store via its [INFO ALL](https://redis.io/commands/info/) command. We now support password protected Redis instances, by [allowing users to set the username/password](https://github.com/netdata/go.d.plugin/pull/1051) in the collector configuration. #### Consul The [Consul collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/consul) is production ready! [Consul by HashiCorp](https://www.consul.io/) is a powerful and complex identity-based networking solution, which is not trivial to monitor. We were lucky to have the assistance of HashiCorp itself in this endeavor, which resulted in a monitoring solution of exceptional quality. Look for common blog posts and announcements in the coming weeks! #### NGINX Plus The [NGINX Plus collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginxplus) monitors the load balancer, API gateway, and reverse proxy built on top of NGINX, by utilizing its [Live Activity Monitoring](https://docs.nginx.com/nginx/admin-guide/monitoring/live-activity-monitoring/) capabilities. We improved the collector that was launched last November with [additional information](https://github.com/netdata/netdata/pull/14080) explaining the charts and the [addition of SSL error metrics](https://github.com/netdata/go.d.plugin/pull/1010). 
#### Elastic Search The [Elastic Search collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/elasticsearch) monitors the search engine's instances via several of the provided local interfaces. To allow better grouping and filtering of the collected metrics we now [create a chart per node index, a dimension per health status](https://github.com/netdata/go.d.plugin/pull/1040). We also [added several OOB alerts](https://github.com/netdata/netdata/pull/14197). #### NVIDIA GPU Our [NVIDIA GPU Collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nvidia_smi) monitors memory usage, fan speed, PCIE bandwidth utilization, temperature, and other GPU performance metrics using the `nvidia-smi` cli tool. Multi-Instance GPU (MIG) is a feature from NVIDIA that lets users partition a single GPU to smaller GPU instances. We [added MIG metrics](https://github.com/netdata/go.d.plugin/pull/1067) for uncorrectable errors and memory usage. We also [added metrics for voltage](https://github.com/netdata/go.d.plugin/pull/1048) and [PCIe bandwidth utilization percentage](https://github.com/netdata/netdata/pull/14315). Last but not least, we significantly improved the collector's performance, by switching to [collecting data using the CSV format](https://github.com/netdata/go.d.plugin/pull/1023). #### Pi-hole We monitor [Pi-hole](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/pihole), the Linux network-level advertisement and Internet tracker blocking application via its [PHP API](https://github.com/pi-hole/AdminLTE). We [fixed](https://github.com/netdata/go.d.plugin/pull/1037) an issue with the requests failing against an authenticated API. #### Network Time Protocol (NTP) daemon The ntpd program is an operating system daemon which sets and maintains the system time of day in synchronism with Internet standard time-servers ([man page](https://linux.die.net/man/8/ntpd)). 
We rewrote our previous python.d collector in Go, improving its performance and maintainability. The new collector still monitors the system variables of a local `ntpd` daemon and optionally the variables of its polled peers. Similarly to `ntpq`, the [standard NTP query program](http://doc.ntp.org/current-stable/ntpq.html), we used the NTP Control Message Protocol over a UDP socket.

The python collector [will be deprecated in the next release](#v1380-deprecation), with no effect on current users.

### Notifications <a id="v1380-collectors"></a>

See [Additional alert notification methods on Netdata Cloud](#v1380-notifications)

The agents can now [send notifications to Mattermost](https://github.com/netdata/netdata/pull/14153), using the Slack integration! [Mattermost](https://mattermost.com/) has a [Slack-compatible API](https://jeffschering.github.io/mmdocs/monolith/developer/api.html#incoming-webhooks) that only required a couple of additional parameters. Kudos to @je2555!

### Exporters <a id="v1380-exporters"></a>

Netdata can [export and visualize Netdata metrics in Graphite](https://learn.netdata.cloud/guides/export/export-netdata-metrics-graphite). Our exporter was broken in v1.37.0 due to our host labels for ephemeral nodes. We fixed the issue with [#14105](https://github.com/netdata/netdata/pull/14105).

## Alerts and Notification Engine <a id="v1380-health"></a>

### Health Engine

To improve performance and stability, we made [health run in a single thread](https://github.com/netdata/netdata/pull/14244).

### Notifications Engine

The agent alert notifications are controlled by the configuration file [health_alarm_notify.conf](https://github.com/netdata/netdata/blob/master/health/notifications/health_alarm_notify.conf). Previously, if one used the `|critical` modifier, the recipients would always get at least 2 notifications: critical and clear. There was no way to stop sending clear/warning notifications afterwards.
We [added](https://github.com/netdata/netdata/pull/14330) the `|nowarn` and `|noclear` notification modifiers, to allow users to really receive just the transitions to the critical state.

We also [fixed the broken redirects from alert notifications to cleared alerts](https://github.com/netdata/netdata-cloud/issues/656).

### Alerts

#### Chart labels in alerts

We constantly strive to improve the clarity of the information provided by the hundreds of out of the box alerts we provide. We can now provide more fine-tuned information on each alert, as we [started using specific chart labels instead of `family`](https://github.com/netdata/netdata/pull/14173). To provide the capability we also had to [change the format of alert info variables](https://github.com/netdata/netdata/pull/14206) to support the more complex syntax.

#### Globally enable/disable specific alerts

Administrators can now globally, permanently disable specific OOB alerts via `netdata.conf`. Previously the options were to [edit individual alert configuration files](https://learn.netdata.cloud/docs/monitor/configure-alarms), or to use the [health management API](https://learn.netdata.cloud/docs/agent/web/api/health#health-management-api).

The `[health]` section of `netdata.conf` now supports the setting `enabled alarms`. Its value defines which alarms to load from both user and stock directories. The value is a [simple pattern](/libnetdata/simple_pattern/README.md) list of alarm or template names, with the default value of `*`, meaning that all alerts are loaded. For example, to disable specific alarms, you can provide `enabled alarms = !oom_kill *`, which will load all alarms except `oom_kill`.

## Visualizations / Charts and Dashboards <a id="v1380-visualization"></a>

Our main focus for visualization is on the Netdata Cloud **Overview** dashboard. This dashboard is our flagship, on which everything we do, all slicing and dicing capabilities of Netdata, are added and integrated.
We are working hard to make this dashboard powerful enough, so that the need to learn a query language for configuring and customizing monitoring dashboards, will be eliminated. On this release, we virtualized all items on the dashboard, allowing us to achieve exceptional performance on page rendering. In previous releases there were issues on dashboards with thousands of charts. Now the number of items in the page is irrelevant! To make slicing and dicing of data easier, we ordered the on-chart selectors in a way that is more natural for most users: ![image](https://user-images.githubusercontent.com/2662304/216680995-807e5e52-571b-423e-a139-5e06c1938785.png) This bar above the chart now describes the data presented, in plain English: **On 6 out of 20 Nodes, group by dimension, the SUM() of 23 Instances, using All dimensions, each as AVG() every 3s** A tool-tip provides more information about the missing nodes: ![image](https://user-images.githubusercontent.com/2662304/216684441-ed39f836-39df-4347-85b2-095a00aa1249.png) And the drop-down menu now shows the exact nodes that contributed data to the query, together with a short explanation on why nodes did not provide any data: ![image](https://user-images.githubusercontent.com/2662304/216681780-eaa0cac3-ab72-4195-8323-e379376e5411.png) Additionally, the pop-out icon next to each node can be used to jump to the single node dashboard of this node. All the slicing and dicing controls (Nodes, Dimensions, Instances), now support filtering. As shown above, there is a search box in the drop-down and a tick-mark to the left of each item in the list, which can be used to instantly filter the data presented. At the same time, we re-worked most of the Netdata collectors to add labels to the charts, allowing the chart to be pivoted directly from the **group by** drop-down menu. 
On the following image, we see the same chart as above, but now the data have been grouped by the label `device`, the values of which became dimensions of the chart. ![image](https://user-images.githubusercontent.com/2662304/216682808-6622cde1-70f0-49dd-ac41-c9fa8d88fb08.png) The data can instantly be filtered by original dimension (`reads` and `writes` in this example), like this: ![image](https://user-images.githubusercontent.com/2662304/216683340-de8c60b5-e66a-441f-9c5c-e45159680df3.png) or even by a specific instance (`disk` in this example), like this: ![image](https://user-images.githubusercontent.com/2662304/216683602-cf4cf1c2-5283-484d-97f6-e49787ed1ac4.png) On the Instances drop down list (shown above), the pop-out icon to the right of each instance can be used to quickly jump to the single node dashboard, and we also made this function automatically scroll the dashboard to the relevant chart's position and filter that chart on the specific instance from which the jump was made. Our goal is to polish and fine tune this interface, to the degree that it will be possible to slice and dice any data, without learning a query language, directly from the dashboard. We believe that this will simplify monitoring significantly, make it more accessible to people, and it will eventually allow all of us to troubleshoot issues without any prior knowledge of the underlying data structures. At the same time, we worked to improve switching between rooms and tabs within a room, by saving the last visible chart and the selected page filters, which are restored automatically when the user switches back to the same room and tab. For the ordering of the sections and subsections on the dashboard menu, we made a change to allow currently collected charts to overwrite the position of the section and subsection (we call it `priority`).
Before this change, archived metrics (old metrics that are retained due to retention), were participating in the election of the `priority` for a section or subsection and because the retention Netdata maintains by default is more than a year, changes to the `priority` were never propagated to the UI. #### Bug fixes We fixed: - [The alignment of the anomaly rate pop-down chart](https://github.com/netdata/netdata-cloud/issues/662) - [The width of the right-hand menu bar](https://github.com/netdata/netdata-cloud/issues/704) - [A crash when filtering dimensions](https://github.com/netdata/netdata-cloud/issues/695) - [The warning when a user tries to leave the last space](https://github.com/netdata/netdata-cloud/issues/649) - [The filters of the Metric Correlation screen incorrectly persisting](https://github.com/netdata/netdata-cloud/issues/653) - [The wrong value being shown for whether a node has ML enabled](https://github.com/netdata/netdata-cloud/issues/692) - [The node filter on the anomalies tab](https://github.com/netdata/netdata-cloud/issues/688) - [The visibility of the chart actions menu that appears inside a chart](https://github.com/netdata/netdata-cloud/issues/648) - [Logstash metrics not being displayed in Netdata Cloud](https://github.com/netdata/netdata-cloud/issues/667) - [The home tab not being updated with the correct number of nodes, after deleting a node](https://github.com/netdata/netdata-cloud/issues/679) ### Real Time Functions See [Functions](#v1380-functions) ### Events Feed See [Events Feed](#v1380-feed). ## Database <a id="v1380-database"></a> ### New database engine See [Dramatic performance and stability improvements, with a smaller agent footprint](#v1380-dbenginev2) ### Metadata sync Saving metadata to SQLite is now faster. Metadata saving starts asynchronously when the agent starts and continues as long as there are metadata to be saved. We implemented optimizations by grouping queries into transactions. 
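To illustrate the idea of grouping metadata writes into transactions, here is a minimal sketch using Python's standard `sqlite3` module. It is not the agent's implementation (the agent is written in C); the table name `dimension_meta`, the helper names, and the choice to group rows by chart name are assumptions made only for this example.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter


def open_store():
    """Create an in-memory database with a hypothetical metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dimension_meta ("
        "chart TEXT, dimension TEXT, PRIMARY KEY (chart, dimension))"
    )
    return conn


def save_grouped(conn, rows):
    """Write (chart, dimension) rows using one transaction per chart.

    Returns the number of transactions that were committed.
    """
    transactions = 0
    for _chart, group in groupby(sorted(rows), key=itemgetter(0)):
        # "with conn" commits once at the end of the block (or rolls back
        # on an exception), so all rows of a chart share a single commit.
        with conn:
            conn.executemany(
                "INSERT OR REPLACE INTO dimension_meta VALUES (?, ?)",
                list(group),
            )
        transactions += 1
    return transactions


conn = open_store()
rows = [("disk.sda", "reads"), ("disk.sda", "writes"), ("system.cpu", "user")]
print(save_grouped(conn, rows), "transactions for", len(rows), "rows")
```

Committing once per group rather than once per row is the usual reason such batching is faster: each commit carries a fixed cost, so fewer commits means less overhead for the same amount of data.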
At runtime this grouping happens per chart, while on shutdown it happens per host. These changes made metadata syncing up to 4x faster. ## Streaming and Replication <a id="v1380-streaming"></a> We introduced very significant reliability and performance improvements to the streaming protocol and the database replication. See [Streaming](#v1380-stream), [Replication](#v1380-repl). At the same time, we fixed SSL handshake issues on established SSL connections, providing stable streaming SSL connectivity between Netdata agents. ## API <a id="v1380-api"></a> Data queries for charts and contexts now have the following additional features: 1. The query planner that decides which tier to use for each query now prefers higher tiers, to speed up queries. 2. Joining of multiple tiers to the same query now prefers higher resolution tiers and joining is accurate. To achieve that, behind the scenes the query planner expands the query of each tier to overlap with its previous and next tiers, and where they intersect it reads points from all the overlapping tiers to decide how exactly the join should happen. 3. Data queries now utilize the parallelism of the new dbengine, to pipeline query preparation of the dimensions of the chart or context being queried, and then preload metric data for dimensions that are in the pipeline. ## Machine Learning <a id="v1380-ml"></a> We have been busy at work under the hood of the Netdata agent to introduce new capabilities that let you extend the "training window" used by Netdata's [native anomaly detection capabilities](https://learn.netdata.cloud/docs/nightly/setup/configure-machine-learning-ml-powered-anomaly-detection).
![image](https://user-images.githubusercontent.com/43294513/217060868-1217b7b9-b6cb-4cba-b4cf-7571864ad4e8.png) We have [introduced a new ML parameter](https://learn.netdata.cloud/docs/nightly/setup/configure-machine-learning-ml-powered-anomaly-detection#descriptions-minmax) called `number of models per dimension` which controls the number of most recently trained models used during scoring. Below is some pseudo-code of how the trained models are actually used in producing [anomaly bits](https://learn.netdata.cloud/docs/nightly/setup/configure-machine-learning-ml-powered-anomaly-detection#anomaly-bit) (which give you an "[anomaly rate](https://learn.netdata.cloud/docs/nightly/setup/configure-machine-learning-ml-powered-anomaly-detection#anomaly-rate)" over any window of time) each second.

```python
# preprocess recent observations into a "feature vector"
latest_feature_vector = preprocess_data([recent_data])

# loop over each trained model
for model in models:
    # if recent feature vector is considered normal by any model, stop scoring
    if model.score(latest_feature_vector) < dimension_anomaly_score_threshold:
        anomaly_bit = 0
        break
    else:
        # only if all models agree the feature vector is anomalous
        # is it considered anomalous by netdata
        anomaly_bit = 1
```

The aim here is to only use those additional stored models when we need to. Once one model suggests that a feature vector looks anomalous, we check all saved models. Only when they all agree that something is anomalous is the anomaly bit finally set to 1, signalling that Netdata considers the most recent feature vector unlike anything seen by any of the models checked (which together span a wider training window). Read more in [this blog post](https://blog.netdata.cloud/extending-anomaly-detection-training-window/)! We now [create ML charts on child hosts](https://github.com/netdata/netdata/pull/14207), when a parent runs ML for a child.
These charts use the parent's hostname to differentiate multiple parents that might run ML for a child. Finally, we [refactored the ML code and added support for multiple KMeans models](https://github.com/netdata/netdata/pull/14198). ## Installation and Packaging <a id="v1380-packaging"></a> ### New hosting of build artifacts <a id="v1380-build"></a> We are always looking to improve the ways we make the agent available to users. Where we host our build artifacts is an important piece of the puzzle, and we've taken some significant steps in the past couple of months. #### New hosting of nightly build artifacts <a id="v1380-nightlies"></a> As of 2023-01-16, our nightly build artifacts are being hosted as GitHub releases on the new https://github.com/netdata/netdata-nightlies/ repository instead of being hosted on Google Cloud Storage. In most cases, this should have no functional impact for users, and no changes should be required on user systems. #### New hosting of native package repositories <a id="v1380-packagerepos"></a> As part of improving support for our native packages, we are migrating off of Package Cloud to our own self-hosted package repositories located at https://repo.netdata.cloud/repos/. This new infrastructure provides a number of benefits, including signed packages, easier on-site caching, more rapid support for newly released distributions, and the ability to support native packages for a wider variety of distributions. Our RPM repositories [have already been fully migrated](https://github.com/netdata/netdata/discussions/14161) and the DEB repositories [are currently in the process of being migrated](https://github.com/netdata/netdata/discussions/14300). #### Official Docker images now available on GHCR and Quay <a id="v1380-dockerimgs"></a> In addition to Docker Hub, our official Docker images are now available on [GHCR](https://github.com/netdata/netdata/pkgs/container/netdata) and [Quay](https://quay.io/repository/netdata/netdata). 
The images are identical across all three registries, including using the same tagging. You can use our Docker images from GHCR or Quay by either configuring them as registries with your local container tooling, or by using `ghcr.io/netdata/netdata` or `quay.io/netdata/netdata` instead of `netdata/netdata`. ### kickstart The directives `--local-build-options` and `--static-install-options` used to only accept a single option each. We now [allow multiple options to be entered](https://github.com/netdata/netdata/pull/14287). We [renamed](https://github.com/netdata/netdata/pull/13881) the `--install` option to `--install-prefix`, to clarify that it affects the directory under which the Netdata agent will be installed. To help prevent user errors, passing an unrecognized option to the kickstart script [now results in a fatal error](https://github.com/netdata/netdata/pull/12943) instead of just a warning. We previously used `grep` to get some info on `login` or `group`, which could not handle cases with centralized authentication like Active Directory or FreeIPA or pure LDAP. We [now use "getent group"](https://github.com/netdata/netdata/pull/14316) to get the group information. ### RPMs We [fixed the required permissions](https://github.com/netdata/netdata/pull/14140) of the `cgroup-network` and `ebpf.plugin` in RPM packages. ### OpenSUSE We [fixed the binary package updates](https://github.com/netdata/netdata/pull/14260) that were failing with an error on "Zypper upgrade". ### FreeBSD We [fixed the missing required package installation of "tar"](https://github.com/netdata/netdata/pull/14095). ### MacOS We [fixed some crashes on MacOS](https://github.com/netdata/netdata/pull/14304). ### Proxmox Netdata on Proxmox virtualization management servers must be allowed to resolve VM/container names and read their CPU and memory limits. 
We now [explicitly add](https://github.com/netdata/netdata/pull/14168) the `netdata` user to the `www-data` group on Proxmox, so that users don't have to do it manually. ### Other We [fixed the path to "netdata.pid"](https://github.com/netdata/netdata/pull/14180) in the logrotate postrotate script, which caused some errors during log rotation. We also [added pre gcc v5 support](https://github.com/netdata/netdata/pull/14239) and allowed building without dbengine. ## Documentation and Demos <a id="v1380-documentation"></a> #### Learn We have been working hard to revamp [Netdata Learn](https://learn.netdata.cloud). We are revising not just its structure and content, but also the Continuous Integration processes around it. We're getting close to the finish line, but you may notice that we currently publish two versions; `1.37.x` is frozen with the state of the docs as of the 1.37.1 release, and the `nightly` version has the target experience. While not yet ready for production, the `nightly` version is the only place where information on the latest features and changes is available. The following screenshot shows how you can switch between versions. <img width="697" alt="image" src="https://user-images.githubusercontent.com/43294513/216357310-c1e8000d-d846-408f-906d-0557ef12266e.png"> Be aware that you may encounter some broken links or missing pages while we are sorting out the several hundred markdown documents and several thousand links they include. We ask for your patience and expect that by the next release we'll have properly launched the new version, which is easier to navigate and use. #### Demo space The [Netdata Demo space](https://app.netdata.cloud/spaces/netdata-demo) on Netdata Cloud is constantly being updated with new rooms, for various use cases. You [don't even need a cloud account](https://github.com/netdata/netdata-cloud/issues/714) to see our powerful infrastructure monitoring in action, so what are you waiting for?
[<img height=300 src="https://user-images.githubusercontent.com/43294513/216787761-9ade2d7e-69d9-47a8-b32f-363052b4f33c.png">](https://app.netdata.cloud/spaces/netdata-demo) ## Administration <a id="v1380-administration"></a> ### Logging We have improved the readability of our main error log file `error.log`, by [moving data collection specific log messages](https://github.com/netdata/netdata/pull/14309) to `collector.log`. For the same reason we [reduced the log verbosity of streaming connections](https://github.com/netdata/netdata/pull/14117). ### New configuration editing script We reimplemented the `edit-config` script we install in the user config directory, adding a few new features, and fixing a number of outstanding issues with the previous script. Overall changes from the existing script: - Error messages are now clearly prefixed with `ERROR:` instead of looking no different from other output from the script. - We now have proper support for command-line arguments. In particular, `edit-config --help` now properly returns usage information instead of throwing an error. Other supported options are `--file` for explicitly specifying the file to edit (using this is not required, but we should ideally encourage it), and `--editor` to specify an editor of choice on the command-line. - We now can handle editing configuration for a Docker container on the host side, instead of requiring it to be done in the container. This is done by copying the file out of the container itself. The script includes primitive auto-detection that should work in most common cases, but the user can also use the new `--container` option to bypass the auto-detection and explicitly specify a container ID or name to use. Supports both Docker and Podman. - Instead of templating in the user config directory at build time, the script now uses the directory it was run from as the target for copying stock config files to. 
This is required for the above-mentioned Docker support, and also makes it a bit easier to test the script without having to do a full build of Netdata. Users can still override this by setting `NETDATA_USER_CONFIG_DIR` in the environment, just like with the old script. - Similarly, instead of templating the stock config directory at build time, we now determine it at runtime by inspecting the `.environment` file created by the install, falling back first to inferring the location from the script’s path and if that fails using the ‘default’ of `/usr/lib/netdata/conf.d`. From a user perspective, this changes nothing for any type of install we officially support and for any third-party packages I know of. This results in a slight simplification of the build code, as well as making testing of the script much easier (you can now literally just copy it to the right place, and it should work). Users can still override this by setting `NETDATA_STOCK_CONFIG_DIR`. - Instead of listing all known files in the help text, we now require the user to run the script with the `--list` option. This has two specific benefits: - It ensures that the actual usage information won’t end up scrolled off the top of the screen by the list of known files. - It avoids the expensive container checks and stock config directory computation when the user just needs the help output. - We now do a quick check of the validity of the editor (either auto-detected or user-supplied) instead of just blindly trusting that it’s usable. This should not result in any user-visible changes, but will provide a more useful error message if the user mistypes the name of their editor of choice. - Instead of blindly excluding paths starting with `/` or `.`, we now do a proper prefix check for the supplied file path to make sure it’s under the user config directory. This provides two specific benefits: - We no longer blindly copy files into directories that are not ours.
For example, with the existing script, you can do `/etc/netdata/edit-config apps_groups.conf`, and it will blindly copy the stock `apps_groups.conf` file to the current directory. With the new script, this will throw an error instead. - Invoking the script using absolute paths that point under the user config directory will work properly. In particular, this means that you do not need to be in the user config directory when invoking the script, provided you use a proper path. Running `netdata/edit-config netdata/apps_groups.conf` when in `/etc` will now work, and `/etc/netdata/edit-config /etc/netdata/apps_groups.conf` will work from anywhere on the system. - If the requested file does not exist, and we do not provide a stock version of it, the script will now create an empty file instead of throwing an error. This is intended to allow it to behave better when dealing with configuration for third-party plugins (we may also want to define a standard location for third party plugins to store their stock configuration to improve this further, but that’s out of scope for this PR). ### Netdata Monitoring The new Netdata Monitoring section on our dashboard has dozens of charts detailing the operation of Netdata. All new components have their charts, dbengine, metrics registry, the new caches, the dbengine query router, etc. At the same time, we added a chart detailing the memory used by the agent and the function it is used for. This was the hardest to gather, since information was spread all over the place, but thankfully the internals of the agents have changed drastically in the last few months, allowing us to have a better visibility on memory consumption. At its heart, the agent is now mainly an array allocator (ARAL) and a dictionary (indexed and ordered lists of objects), carefully crafted to achieve their maximum performance when multithreaded. 
Everything we do, from data collection, to health, streaming, replication, etc., is actually business logic on top of these elements. ### CLI `netdatacli version` [now returns the version of netdata](https://github.com/netdata/netdata/pull/14094). ## Other Notable Changes <a id="v1380-other"></a> ### Netdata Paid Subscriptions <a id="v1380-paidplans"></a> *Coming by Feb 15th* At Netdata we take pride in our commitment to the principle of providing free and unrestricted access to high-quality monitoring solutions. We offer our free SaaS offering - what we call the **Community plan** - and the Open Source Agent, which feature unlimited nodes and users, unlimited metrics, and retention, providing real-time, high-fidelity, out-of-the-box infrastructure monitoring for packaged applications, containers, and operating systems. We also start providing paid subscriptions, designed to provide additional features and capabilities for businesses that need tighter and customizable integration of the free monitoring solution to their processes. These are divided into three different plans: **Pro**, **Business**, and **Enterprise**. Each plan offers a different set of features and capabilities to meet the needs of businesses of different sizes and with different monitoring requirements. You can change your plan at any time. Any remaining balance will be credited to your account, even for yearly plans. Netdata designed this in order to respect the unpredictability of world dynamics. Less anxiety about choosing the right commitments in order to save money in the long run. ![image](https://user-images.githubusercontent.com/82235632/215895884-be6c705b-8115-4ce9-a5ee-ec65b7ae8ccb.png) The paid Netdata Cloud plans work as subscriptions and overall consist of: * A flat fee component (price per space) * An on-demand, metered component, that is related to the usage of Netdata Cloud. 
For us, usage is directly linked to the number of nodes you have running, regardless of how many metrics each node collects (see details below). Netdata provides two billing frequency options: * Monthly - Pay as you go, where we charge both the flat fee and the on-demand component every month * Yearly - Annual prepayment, where we charge upfront the flat fee and committed amount related to your estimated usage of Netdata (see details below) The detailed feature list and pricing are available at [netdata.cloud/pricing](https://www.netdata.cloud/pricing). <details> <summary>More details on usage pricing</summary> #### Running nodes and billing The only dynamic variable we consider for billing is the number of concurrently running nodes or agents. We only charge you for your active running nodes. We obviously **don't count offline nodes**, which were connected in a previous month and are currently offline, with their metrics unavailable. But we go further and **don't count stale nodes** either, which are available to query through a Netdata parent agent but are not actively collecting metrics at the moment. To ensure we don't overcharge any user due to sporadic spikes throughout a month or even at a certain point in a day we: * Calculate a **daily P90 count of your running nodes**. We take a daily snapshot of your running nodes, and using the node state change events (live, offline) we guarantee that a daily P90 figure is calculated to remove any temporary spikes within the day. * Do a running **P90 calculation from the start to the end of the monthly billing cycle**. This way, we guarantee that we remove spikes that happened in just a couple of days within a single month. :note: Even if you have a yearly billing frequency, we track the P90 counts monthly, to charge any potential overage over your committed nodes. #### Committed nodes When you subscribe to a Yearly plan you need to specify the number of nodes that you commit to.
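As an aside on the P90 figures described under "Running nodes and billing", the following is a minimal sketch of how a two-level P90 removes short spikes. It is only an illustration: the nearest-rank percentile method, the hourly sampling, and the function names `p90` and `billable_nodes` are assumptions for this example, not a description of how Netdata Cloud actually computes the figure.

```python
def p90(values):
    """Nearest-rank 90th percentile of a non-empty list of node counts."""
    if not values:
        raise ValueError("p90 needs at least one sample")
    ordered = sorted(values)
    rank = -(-9 * len(ordered) // 10)  # ceil(0.9 * n) using integer math
    return ordered[rank - 1]


def billable_nodes(daily_samples):
    """Reduce each day's samples to a daily P90, then take the P90 of those.

    daily_samples holds one list of running-node counts per day of the
    billing cycle (hypothetical hourly samples in the example below).
    """
    daily_p90 = [p90(day) for day in daily_samples]
    return p90(daily_p90)


# A 30-day cycle: 26 steady days at 10 nodes, one day with a 2-hour burst
# to 40 nodes, and 3 days spent entirely at 50 nodes.
steady_day = [10] * 24
burst_day = [10] * 22 + [40] * 2
cycle = [steady_day] * 26 + [burst_day] + [[50] * 24] * 3

print(billable_nodes(cycle))
```

With these sample numbers the script prints `10`: the 2-hour burst disappears at the daily level, and the 3 busy days disappear at the monthly level.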
In addition to the discounted flat fee, you then get a 25% discount on the per node fee, as you're also committing to have those connected for a year. The charge for the committed nodes is part of your annual prepayment (`node discounted price x committed nodes x 12 months`). If in a given month your usage is over these committed nodes, we charge the undiscounted cost per node for the overage. </details> #### Agent-Cloud link support for authenticated proxies The Agent-Cloud link ([ACLK](https://learn.netdata.cloud/docs/agent/aclk)) is the mechanism responsible for securely connecting a Netdata Agent to your web browser through Netdata Cloud. The ACLK establishes an outgoing secure WebSocket (WSS) connection to Netdata Cloud on port `443`. The ACLK is encrypted, safe, and _is only established if you connect your node_. We have always supported unauthenticated HTTP proxies for the ACLK. We have now [added support for HTTP Basic authentication](https://github.com/netdata/netdata/pull/13762). We also [fixed a race condition on the ACLK query thread startup](https://github.com/netdata/netdata/pull/14164). ## Deprecation notice <a id="v1380-deprecation"></a> The following items will be removed in our next minor release (v1.39.0): > Patch releases (if any) will not be affected.
| Component | Type | Will be replaced by | |--------------------------------------------------------------------------------------------------------------|:---------:|:----------------------------------------------------------------------------------------:| | [python.d/ntpd](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/ntpd) | collector | [go.d/ntpd](https://github.com/netdata/go.d.plugin/tree/master/modules/ntpd) | | [python.d/proxysql](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/proxysql) | collector | [go.d/proxysql](https://github.com/netdata/go.d.plugin/tree/master/modules/proxysql) | | [python.d/rabbitmq](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/rabbitmq) | collector | [go.d/rabbitmq](https://github.com/netdata/go.d.plugin/tree/master/modules/rabbitmq) | | [python.d/nvidia_smi](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/nvidia_smi) | collector | [go.d/nvidia_smi](https://github.com/netdata/go.d.plugin/tree/master/modules/nvidia_smi) | ### Deprecated in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.37.0#v1370-deprecation), the following items have been removed in this release: | Component | Type | Replaced by | |--------------------------------------------------------------------------------------------------------|:---------:|:----------------------------------------------------------------------------------:| | [python.d/dockerd](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/dockerd) | collector | [go.d/docker](https://github.com/netdata/go.d.plugin/tree/master/modules/docker) | | [python.d/logind](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/logind) | collector | [go.d/logind](https://github.com/netdata/go.d.plugin/tree/master/modules/logind) | | 
[python.d/mongodb](https://github.com/netdata/netdata/tree/v1.36.1/collectors/python.d.plugin/mongodb) | collector | [go.d/mongodb](https://github.com/netdata/go.d.plugin/tree/master/modules/mongodb) | | [fping](https://github.com/netdata/netdata/tree/v1.36.1/collectors/fping.plugin) | collector | [go.d/ping](https://github.com/netdata/go.d.plugin/tree/master/modules/ping) | <details> <summary>We also removed support for Fedora 35, OpenSuse Leap 15.3 and Fedora 36 ARMv7 native packages.</summary> - [Removed Fedora 35 from the list of supported platforms](https://github.com/netdata/netdata/pull/14136) - [Removed openSUSE Leap 15.3 from CI and support](https://github.com/netdata/netdata/pull/13416) - [Dropped ARMv7 native packages for Fedora 36](https://github.com/netdata/netdata/pull/14233) </details> ## Netdata Agent Release Meetup <a id="v1380-release-meetup"></a> Join the Netdata team on the **7th of February, at 17:00 UTC** for the [Netdata Agent Release Meetup](https://discord.gg/NQqdEbtn?event=1070676611785031720). Together we’ll cover: - Release Highlights. - Acknowledgements. - Q&A with the community. [RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/291356599/) - we look forward to meeting you. ## Support options <a id="v1380-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. 
- [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it! ## Acknowledgements <a id="v1380-ack"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @je2555 for [enabling alert notifications to Mattermost](https://github.com/netdata/netdata/pull/14153) using Slack-compatible webhooks. - @rex4539 for [fixing various typos in documentation](https://github.com/netdata/netdata/pull/14194). - @artemsafiyulin for [fixing](https://github.com/netdata/go.d.plugin/pull/1039) the replication lag calculation of the postgreSQL collector. - @vobruba-martin for [adding](https://github.com/netdata/netdata/pull/14330) `|nowarn` and `|noclear` notification modifiers to agent notifications. - @ghanapunq for adding [PCIe bandwidth utilization metrics](https://github.com/netdata/netdata/pull/14315) to NVIDIA GPU monitoring. - @Kerleyark and @mikerenfro for [verifying](https://github.com/netdata/netdata/pull/14099) that the CSV format significantly improves the performance of the NVIDIA GPU collector. - @martindue for [fixing the negative temperatures bug](https://github.com/netdata/netdata/pull/14435) in 1-wire sensor monitoring. 
## New Contributors * @je2555 made their first contribution in https://github.com/netdata/netdata/pull/14153 * @ghanapunq made their first contribution in https://github.com/netdata/netdata/pull/14315 * @martindue made their first contribution in https://github.com/netdata/netdata/pull/14435 **Full Changelog**: https://github.com/netdata/netdata/compare/v1.37.0...v1.38.0 2023-02-06T16:05:20+00:00 netdata v1.38.1 netdata v1.38.1 2023-02-13T16:28:44+00:00 The first patch release for v1.38 [updates the version of OpenSSL](https://github.com/netdata/netdata/pull/14450) included in our static builds and Docker images to `v1.1.1t`, to resolve a [few moderate security vulnerabilities](https://www.openssl.org/news/vulnerabilities-1.1.1.html) in `v1.1.1n`. The patch also includes the following minor bug fixes: - [We fixed the handling of dimensions with no data in a specific timeframe](https://github.com/netdata/netdata/pull/14447). When the metrics registry recorded a dimension as present in a specific timeframe, but the dimension did not have any data for that timeframe, the query engine would return random data that happened to be in memory. - [We fixed occasional crashes during shutdown when not using eBPF](https://github.com/netdata/netdata/pull/14470). - [We fixed the systemd service file handling on systems using a systemd version older than v235](https://github.com/netdata/netdata/pull/14471). - We fixed build failures on [FreeBSD 14 release candidates](https://github.com/netdata/netdata/pull/14446), [FreeBSD < 13.1](https://github.com/netdata/netdata/pull/14430), and [environments with Linux kernel version < 5.11](https://github.com/netdata/netdata/pull/14430). ## Support options <a id="vXXXX-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. 
Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hangout with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1300 engineers are already using it! 2023-02-13T16:28:44+00:00 netdata v1.39.0 netdata v1.39.0 2023-05-08T14:49:58+00:00 - [Netdata open-source growth](#v1390-netdata-open-source-growth) - [Release highlights](#v1390-release-highlights) - **[Netdata Charts v3.0](#v1390-netdata-charts-v30)**<br/> A new era for monitoring charts. Powerful, fast, easy to use. Instantly understand the dataset behind any chart. Slice, dice, filter and pivot the data in any way possible! - **[Windows support](#v1390-windows-support)**<br/> Windows hosts are now first-class citizens. You can now enjoy out-of-the-box monitoring of over 200 metrics from your Windows systems and the services that run on them. - [Virtual nodes and custom labels](#v1390-virtual-nodes-and-custom-labels)<br/> You now have access to more monitoring superpowers for managing medium to large infrastructures. With custom labels and virtual hosts, you can easily organize your infrastructure and ensure that troubleshooting is more efficient. 
- [Major upcoming changes](#v1390-major-upcoming-changes)<br/> Separate packages for data collection plugins, mandatory `zlib`, no upgrades of existing installs from versions prior to v1.11. - [Bar charts for functions](#v1390-bar-charts-for-functions) - [Opsgenie notifications for Business Plan users](#v1390-opsgenie-notifications-for-business-plan-users)<br/> Business plan users can now seamlessly integrate Netdata with their Atlassian Opsgenie alerting and on call management system. - [Data Collection](#v1390-data-collection) - [Containers and VMs CGROUPS](#v1390-containers-and-vms-cgroups) - [Docker](#v1390-docker) - [Kubernetes](#v1390-kubernetes) - [Kernel traces/metrics eBPF](#v1390-kernel-tracesmetrics-ebpf) - [Disk Space Monitoring](#v1390-disk-space-monitoring) - [OS Provided Metrics proc.plugin](#v1390-os-provided-metrics-procplugin) - [PostgreSQL](#v1390-postgresql) - [DNS Query](#v1390-dns-query) - [HTTP endpoint check](#v1390-http-endpoint-check) - [Elasticsearch and OpenSearch](#v1390-elasticsearch-and-opensearch) - [Dnsmasq DNS Forwarder](#v1390-dnsmasq-dns-forwarder) - [Envoy](#v1390-envoy) - [Files and directories](#v1390-files-and-directories) - [RabbitMQ](#v1390-rabbitmq) - [charts.d.plugin](#v1390-chartsdplugin) - [Anomalies](#v1390-anomalies) - [Generic structured data with Pandas](#v1390-generic-structured-data-pandas) - [Generic Prometheus collector](#v1390-generic-prometheus-collector) - [Alerts and Notifications](#v1390-alerts-and-notifications) - [Notifications](#v1390-notifications) - [Improved email alert notifications](#v1390-improved-email-alert-notifications) - [Receive only notifications for unreachable nodes](#v1390-receive-only-notifications-for-unreachable-nodes) - [ntfy agent alert notifications](#v1390-ntfy-agent-alert-notifications) - [Enhanced Real-Time Alert Synchronization on Netdata Cloud](#v1390-enhanced-real-time-alert-synchronization-on-netdata-cloud) - [Visualizations / Charts and 
Dashboards](#v1390-visualizations--charts-and-dashboards) - [Events Feed](#v1390-events-feed) - [Machine Learning](#v1390-machine-learning) - [Installation and Packaging](#v1390-installation-and-packaging) - [Improved Linux compatibility](#v1390-improved-linux-compatibility) - [Administration](#v1390-administration) - [New way to retrieve netdata.conf](#v1390-new-way-to-retrieve-netdataconf) - [Documentation and Demos](#v1390-documentation-and-demos) - [Deprecation notice](#v1390-deprecation-notice) - [Deprecated in this release](#v1390-deprecated-in-this-release) - [Netdata Agent Release Meetup](#v1390-netdata-agent-release-meetup) - [Support options](#v1390-support-options) - [Running survey](#v1390-running-survey) - [Acknowledgements](#v1390-acknowledgements) ## Netdata open-source growth <a id="v1390-netdata-open-source-growth"></a> <!-- Retrieve most of these stats from netdata/netdata/README.md badges --> - Over 62,000 GitHub Stars - Over 1.5 million online nodes - Almost 92 million sessions served - Over 600 thousand total nodes in Netdata Cloud ## Release highlights <a id="v1390-release-highlights"></a> ### Netdata Charts v3.0 <a id="v1390-netdata-charts-v30"></a> We are excited to announce Netdata Charts v3.0 and the NIDL framework. These are currently available at Netdata Cloud. In the next Netdata release, the agent dashboard will also be replaced to use the same charts. One of the key obstacles in understanding an infrastructure and troubleshooting issues is making sense of the data we see on charts. Most monitoring solutions assume that the users have a deep understanding of the underlying data, so during visualization they do nothing to help users comprehend the data more easily or quickly. The problem becomes even more apparent when the users troubleshooting infrastructure problems are not the ones who developed the dashboards.
In these cases all kinds of misunderstandings are possible, resulting in bad decisions and slower time to resolution. To help users instantly understand and validate the data they see on charts, we developed the NIDL (Nodes, Instances, Dimensions, Labels) framework and we changed all the Netdata query engines, at both the agent and the cloud, to enrich the returned data with additional information. This information is then visualized on all charts. ![image](https://user-images.githubusercontent.com/43294513/236503200-b1c60298-d6d9-49bb-8927-ede3c78e25e3.png) <details> <summary>Click to read more about the changes</summary> #### Embedded Machine Learning for every metric Netdata's unsupervised machine learning algorithm creates a unique model for each metric collected by your agents, using exclusively the metric's past data. We don't train ML models in a lab, or on aggregated sample data. We then use these unique models during data collection to predict the value that should be collected and check if the collected value is within the range of acceptable values based on past patterns and behavior. If the value collected is an outlier, we mark it as anomalous. This unmatched capability of real-time predictions as data is collected allows you to **detect anomalies for potentially millions of metrics across your entire infrastructure within a second of occurrence**. Before this release, users had to either go to the "Anomalies" tab, or enable anomaly rate information from a button on the charts to access the anomaly rate. We found that this was not very helpful, since a lot of users were not aware of this functionality, or they forgot to check it. So, we decided that the best use of this information is to visualize it by default on all charts, so that users will instantly see if the AI algorithm in Netdata believes the values are not following past behavior.
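As a rough illustration of the per-metric idea described above, the sketch below flags a newly collected value as anomalous when it falls outside a range derived only from that metric's own past values, and reports an anomaly rate as the percentage of flagged points in a window. It deliberately replaces Netdata's actual models with a simple mean ± k standard deviations band; the function names, the `warmup` length and the `k=3.0` threshold are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def is_anomalous(history, value, k=3.0):
    """Return True when `value` lies outside mean +/- k * stddev of `history`.

    The band is a stand-in used for illustration, not Netdata's model.
    """
    if len(history) < 2:
        return False  # not enough past data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) > k * sigma


def score_stream(values, warmup=10, k=3.0):
    """Flag each value using only the values collected before it."""
    return [
        i >= warmup and is_anomalous(values[:i], value, k)
        for i, value in enumerate(values)
    ]


def anomaly_rate(flags):
    """Percentage of points in a window that were flagged as anomalous."""
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Hypothetical samples of one metric: a steady signal followed by a spike.
values = [10, 11, 10, 9, 10, 11, 10, 9, 10, 11, 10, 50]
flags = score_stream(values)
rate = anomaly_rate(flags)
```

Because every metric is scored against its own history, the same code can run independently for each collected metric, and the per-point flags can be averaged over any time window to obtain the rate shown on a chart.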
In addition to the summarized tables and chart overlay, a new anomaly rate ribbon on top of each chart visualizes the combined anomaly rate of all the underlying data, highlighting areas of interest that may not be easily visible to the naked eye. Hovering over the anomaly rate ribbon provides a histogram of the anomaly rates per dimension presented, for the specific point in time. <img src="https://user-images.githubusercontent.com/43294513/235494042-2c2a0c17-3681-4709-8f39-577e23aebfe6.png" width=500/> Anomaly rate visualization does not make Netdata slower. Anomaly rate is saved in the Netdata database, together with metric values, and due to the smart design of Netdata, it does not even incur a disk footprint penalty. #### Introducing chart annotations for comprehensive context Chart annotations have arrived! When hovering over the chart, the overlay may display an indication in the "Info" column. Currently, annotations are used to inform users of any data collection issues that might affect the chart. Below each chart, we added an information ribbon. This ribbon currently shows 3 states related to the points presented in the chart: 1. **[P]: Partial Data** At least one of the dimensions in the chart has partial data, meaning that not all instances available contributed data to this point. This can happen when a container is stopped, or when a node is restarted. This indicator helps to gain confidence in the dataset, in situations when unusual spikes or dives appear due to infrastructure maintenance, or due to failures in part of the infrastructure. 2. **[O]: Overflowed** At least one of the data sources included in the chart was a counter that overflowed at exactly that point. 3. **[E]: Empty Data** At least one of the dimensions included in the chart has no data at all for the given points. All these indicators are also visualized per dimension, in the pop-over that appears when hovering over the chart.
<img src="https://user-images.githubusercontent.com/43294513/235477397-753941a6-66ae-4979-b982-dfe3bd9ab598.png" width=600/> #### New hover pop-over Hovering over any point in the chart now reveals a more informative overlay. This includes a bar indicating the volume percentage of each time series compared to the total, the anomaly rate, and a notification if there are data collection issues (annotations from the info ribbon). The pop-over sorts all dimensions by value, bolds the dimension closest to the mouse, and presents a histogram based on the values of the dimensions. <img src="https://user-images.githubusercontent.com/43294513/235470493-49cf7890-5d21-4aca-9e8d-5d0874d56389.png" width=600/> When hovering over the anomaly ribbon, the pop-over sorts all dimensions by anomaly rate, and presents a histogram of these anomaly rates. #### NIDL framework You can now rapidly access condensed information for collected metrics, grouped by node, monitored instance, dimension, or any label key/value pair. Above all charts, there are a few drop-down menus. These drop-down menus have 2 functions: 1. Provide additional information about the visualized chart, to help us understand the data we see. 2. Provide filtering and grouping capabilities, altering the query on the fly, to help us get different views of the dataset. In this release, we extended the query engines of Netdata (both at the agent and the cloud), to include comprehensive statistical data to help us understand what we see on the chart. We developed the NIDL framework to standardize this presentation across all charts. The NIDL framework attaches the following metadata to every metric we collect: 1. The Node each metric comes from 2. The Instance each metric belongs to. An instance can be a container, a disk, a network interface, a database server, a table in a given data server, etc. The instance describes exactly which component of our infrastructure we monitor.
On the charts, we replaced the word "instance" with the proper name of that instance. So, when the instance is a disk, we see "disks". When it is a container, we see "containers", etc. 3. The Dimensions are the individual metrics related to an instance under a specific context. 4. The Labels are all the labels that are available for each metric, that may or may not be related to the node or the instance of the metric. Since all our metrics now have this metadata, we use it at query time to provide, for each of them, the following consolidated data for the visible time frame: 1. The volume contribution of each of them to the final query. So even if a query comes from 1000 nodes, we can instantly see the contribution of each node in the result. The same for instances, dimensions and labels. Especially for labels, Netdata also provides the volume contribution of each label `key:value` pair to the final query, so that we can immediately see for all label values involved in the query how much they affected the chart. 2. The anomaly rate of each of them for the time-frame of the query. This is used to quickly spot which of the nodes, instances, dimensions or labels have anomalies in the requested time-frame. 3. The minimum, average and maximum values of all the points used for the query. This is used to quickly spot which of the nodes, instances, dimensions or labels are responsible for a spike or a dive in the chart. All of these drop-down menus can now be used for instantly filtering the dataset, by including or excluding specific nodes, instances, dimensions or labels, directly from the drop-down menu, without the need to edit a query string and without any additional knowledge of the underlying data. ![image](https://user-images.githubusercontent.com/43294513/235470150-62a3b9ac-51ca-4c0d-81de-8804e3d733eb.png) #### Multiple Group-by At the same time, the new query engine of Netdata has been enhanced to support multiple group-by at once.
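To make the idea of several simultaneous groupings concrete, here is a minimal sketch that assumes every collected point carries the node, instance, dimension and label metadata described above. The record layout, the `group_by` and `contribution` helpers, the `label:<name>` key syntax and the sample values are all hypothetical; this is not Netdata's query API.

```python
from collections import defaultdict


def group_by(points, keys):
    """Sum point values per unique combination of the given keys.

    A key is either a top-level field ("node", "instance", "dimension")
    or a label key written as "label:<name>".
    """
    totals = defaultdict(float)
    for point in points:
        group = tuple(
            point["labels"].get(key[len("label:"):], "[unset]")
            if key.startswith("label:")
            else point[key]
            for key in keys
        )
        totals[group] += point["value"]
    return dict(totals)


def contribution(totals):
    """Percentage of the overall volume contributed by each group."""
    overall = sum(totals.values())
    if overall == 0:
        return {group: 0.0 for group in totals}
    return {group: 100.0 * value / overall for group, value in totals.items()}


# Hypothetical sample: disk operations reported by two nodes.
points = [
    {"node": "web1", "instance": "sda", "dimension": "reads",
     "labels": {"env": "prod"}, "value": 10.0},
    {"node": "web1", "instance": "sdb", "dimension": "reads",
     "labels": {"env": "prod"}, "value": 30.0},
    {"node": "web2", "instance": "sda", "dimension": "reads",
     "labels": {"env": "dev"}, "value": 20.0},
    {"node": "web2", "instance": "sda", "dimension": "writes",
     "labels": {"env": "dev"}, "value": 40.0},
]

by_node = group_by(points, ["node"])
by_node_and_dimension = group_by(points, ["node", "dimension"])
by_env = group_by(points, ["label:env"])
node_share = contribution(by_node)
```

Each distinct tuple of key values becomes one dimension on the resulting chart, so adding a key to the list splits the existing groups further without changing the underlying dataset.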
The "Group by" drop-down menu allows selecting 1 or more groupings to be applied at once on the same dataset. Currently it supports: 1. Group by Node, to summarize the data of each node, and provide one dimension on the chart for each of the nodes involved. Filtering nodes is supported at the same time, using the nodes drop-down menu. 2. Group by Instance, to summarize the data of each instance and provide one dimension on the chart for each of the instances involved. Filtering instances is supported at the same time, using the instances drop-down menu. 3. Group by Dimension 4. Group by Label, to summarize the data for each label value. Multiple label keys can be selected at the same time. Using this menu, you can slice and dice the data in any possible way, to quickly get different views of it, without the need to edit a query string and without any need to better understand the format of the underlying data. Netdata will do it by itself. <img src="https://user-images.githubusercontent.com/43294513/235468819-3af5a1d3-8619-48fb-a8b7-8e8b4cf6a8ff.png" width=800/> </details> ### Windows support <a id="v1390-windows-support"></a> We are excited to announce that our Windows monitoring capabilities have been greatly improved with the addition of over 170 new system, network, and application metrics. This includes out-of-the-box support for MS Exchange, MS SQL, IIS, Active Directory (including AD Certificate and AD Federation Services). To try out Netdata directly on your Windows machine, our `.msi` [installer](https://github.com/netdata/msi-installer) allows for quick and easy installation with a Netdata WSL distribution. However, for production deployments, one or more Linux nodes are still required to run Netdata and store your metrics, as shown in the provided diagram.
![windows](https://user-images.githubusercontent.com/43294513/232522572-1fe51228-953b-43d2-81c4-dcb0a8974db5.jpg) To fully support this architecture, we have added the ability to declare each Windows host as a Netdata node. You can learn more about this feature in the [virtual nodes](#org) section. <details> <summary>Click to read more about the changes</summary> We continue to rely on the native [Prometheus node_exporter](https://github.com/prometheus/node_exporter) running on each Windows VM or physical machine, as we support a FOSS ecosystem with the best-in-class software for each role. Although node_exporter does not support per-second resolution like Netdata offers in Linux systems, we are working towards providing the ability to install Netdata directly on Windows machines without the need for containers or dedicated Linux VMs. In the meantime, we believe that our suggested architecture allows for quick and easy monitoring of Windows systems and applications. For more information, please check out our high-level introduction to [Windows monitoring](https://www.netdata.cloud/windows-monitoring/), our [demo](https://app.netdata.cloud/spaces/netdata-demo/rooms/windows/overview), or our [Windows collector documentation](https://learn.netdata.cloud/docs/data-collection/monitor-anything/system%20metrics/windows-machines/). </details> ### Virtual nodes and custom labels <a id="v1390-virtual-nodes-and-custom-labels"></a> Netdata provides powerful tools for organizing hundreds of thousands of metrics collected every second in large infrastructures. From the automated organization into sections of related out-of-the-box aggregate charts, to concepts like spaces and war rooms that connect the metrics with the people who need to use them, scale is no problem. Easily slicing and dicing the metrics via grouping and filtering in our charts is also essential for exploration and troubleshooting, which is why we previously introduced host labels and default metric labels.
To complete the available tool set, Netdata now offers the ability to define custom metric labels and virtual nodes. You can read how everything fits together in [our documentation](https://learn.netdata.cloud/docs/deployment-in-production/organize-systems-metrics-and-alerts). You can use custom labels to group and filter metrics in the Netdata Cloud aggregate charts. Virtual nodes work like normal Netdata Cloud nodes for the metrics you assign to them and can be added to any room. <details> <summary>Click to read more about the changes</summary> Custom metric labels let you group or filter by dimensions that make sense to you and your team. This feature has been available since v1.36.0, but was previously considered experimental and not well-documented. We have now made it extremely simple to [add custom labels](https://learn.netdata.cloud/docs/deployment-in-production/organize-systems-metrics-and-alerts) to `go.d.plugin` data collection jobs. The ability to define a virtual node is a new feature that is essential for monitoring remote Windows hosts, but has many other potential uses. For example, you may have a central monitoring node collecting data from many remote database hosts that you aren't allowed to install software on. You may also use the [HTTP endpoint collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/networking/http-endpoints/) to check the availability and latency of APIs on multiple remote endpoints. Defining virtual nodes lets you substantiate those entities that have no Netdata running on them, so they can appear in Netdata Cloud, be placed in rooms, filtered and grouped easily, and have their virtual node name displayed in alerts. Learn how to [configure virtual nodes](https://learn.netdata.cloud/docs/deployment-in-production/organize-systems-metrics-and-alerts#virtual-nodes) for any go.d.plugin data collection job. 
</details> ### Major upcoming changes <a id="v1390-major-upcoming-changes"></a> Please read carefully through the following planned changes in our packaging, support of existing installs and required dependencies, as they may impact you. We are committed to providing the most up-to-date and reliable software, and we believe that the changes outlined below will help us achieve this goal more efficiently. As always, we are happy to provide any assistance needed during this transition. <details> <summary>Click to read more about the changes</summary> #### Upcoming collector packaging changes As previously discussed on our blog, we will be changing how we package our external data collection plugins in the coming weeks. This change will be reflected in nightly builds a few weeks after this release, and in stable releases starting with v1.40.0. Please note that any patch releases for v1.39.0 will _not_ include this change. For detailed information on this change and how it may impact you, please refer to our blog post titled [Upcoming Changes to Plugins in Native Packages](https://blog.netdata.cloud/split-plugin-packages/). #### Upcoming end of support for upgrading very old installs Beginning shortly after this release, we will no longer be providing support for upgrading existing installs from versions prior to Netdata v1.11.0. It is highly unlikely that this change will affect any existing users, as v1.11.0 was released in 2018. However, this change is important in the long-term, as it will allow us to make our installer and updater code more portable. #### Upcoming mandatory dependency on zlib In the near future, we will be making a significant change to the Netdata agent by making `zlib` a mandatory dependency. Although we have not treated it as a mandatory dependency in the past, a number of features that we consider core parts of the agent rely on `zlib`. 
Given that `zlib` is ubiquitous across almost every platform, there is little to no benefit to it being an optional dependency. As such, this change is unlikely to have a significant impact on the vast majority of our users. The change will be implemented in nightly builds shortly after this release and in stable releases starting with v1.40.0. Please note that any patch releases for v1.39.0 will _not_ include this change. </details> ### Bar charts for functions <a id="v1390-bar-charts-for-functions"></a> In v1.38, we introduced real-time [functions](https://github.com/netdata/netdata/releases/tag/v1.38.0#v1380-functions) that enable you to trigger specific routines to be executed by a given Agent on demand. Our initial function provided detailed information on currently running processes on the node, effectively replacing `top` and `iotop`. We have now expanded the versatility of functions by incorporating configurable bar charts above the table displaying detailed data. These charts will be a standard feature in all future functions, granting you the ability to manipulate and analyze the retrieved data as needed. <img width="1200" alt="image" src="https://user-images.githubusercontent.com/43294513/235497989-f5ebe5e2-bbe6-499d-93de-e8eb86430506.png"> ### Opsgenie notifications for Business Plan users <a id="v1390-opsgenie-notifications-for-busniess-plan-users"></a> Ensuring the reliable delivery of alert notifications is crucial for maintaining the reliability of your services. While individual Netdata agents were already able to send alert notifications to Atlassian's Opsgenie, Netdata Cloud adds centralized control and more robust retry and failure handling mechanisms to improve the reliability of the notification delivery process. Business Plan users can now configure Netdata Cloud to send alert notifications to their Atlassian Opsgenie platform, using our centralized alert dispatching feature. 
This feature helps to ensure the reliable delivery of notifications, even in cases where individual agents are offline or experiencing issues. <img src="https://user-images.githubusercontent.com/43294513/234079607-f8988f42-5711-4e17-a67b-6ffa9aa7ded3.png" width=500/> We are committed to continually extending the capabilities of Netdata Cloud, and our focus on centralized alert dispatching is just one example of this. By adding more centralized dispatching options, we can further increase the reliability of notification delivery and help our users maintain the highest levels of service reliability possible. ## Data Collection <a id="v1390-data-collection"></a> ### Containers and VMs (CGROUPS) <a id="v1390-containers-and-vms-cgroups"></a> The [cgroups plugin](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Virtualized%20Environments/Containers-and-VMs-cgroups) reads information on Linux Control Groups to monitor containers, virtual machines and systemd services. Previously, we identified individual Docker containers solely through their container ID, which may not always provide adequate information to identify potential issues with your infrastructure. However, we've made significant improvements to our system by incorporating labels containing the [image](https://github.com/netdata/netdata/pull/14872) and the [name](https://github.com/netdata/netdata/pull/14856) of each container to all the collected metrics. These labels allow you to group and filter the containers in a more efficient and effective manner, enabling you to quickly pinpoint and troubleshoot any issues that may arise. We always strive to provide the most informative chart titles and descriptions. The title of all our container CPU usage charts explains that 100% utilization means 1 CPU core, which also means you can exceed 100% when you add the utilization of multiple cores. This logic is a bit foreign to Kubernetes monitoring, where `mCPU` is clearer.
So we [modified the chart title](https://github.com/netdata/netdata/pull/14791) to state that 100% utilization is equivalent to 1000 mCPUs in k8s. ### Docker <a id="v1390-docker"></a> Netdata [monitors the Docker engine](https://learn.netdata.cloud/docs/data-collection/monitor-anything/virtualized-environments/docker) to automatically generate charts for container health and state, and image size and state. Previously, this collector only retrieved aggregate metrics for the containers managed by the Docker engine. We started a major change in the way we collect metrics from Docker so that [we can now present the health of each container separately](https://github.com/netdata/go.d.plugin/pull/1148), or grouped by the container name and image labels. Some teething issues with this change were fixed quickly with [#1160](https://github.com/netdata/go.d.plugin/pull/1160). We recently increased the client version of our collector, which started causing issues with older Docker engine servers. We resolved these issues by [adding client version negotiation](https://github.com/netdata/go.d.plugin/pull/1136) to our Docker collector. ### Kubernetes <a id="v1390-kubernetes"></a> Monitoring Kubernetes clusters can be challenging due to the intricate nature of the infrastructure.
Identifying crucial aspects to monitor necessitates considerable expertise, which Netdata provides out-of-the-box through dedicated collectors for every layer of your Kubernetes infrastructure. One key area to keep an eye on is the overall cluster state, which we address using the [Kubernetes Cluster State Collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/virtualized-environments/kubernetes-cluster-state). This collector generates automated dashboards for 37 metrics encompassing overall node and pod resource limits and allocations, as well as pod and container readiness, health, and container restarts. Initially, we displayed the rate of container restarts, as we did with numerous other events. However, restarts are infrequent occurrences in many infrastructures. Displaying the rate of sparse events can lead to suboptimal charts for troubleshooting purposes. To address this, we have modified the logic and now [present the absolute count of container restarts](https://github.com/netdata/go.d.plugin/pull/1141) for enhanced clarity. Kubernetes monitoring also relies on the [cgroups plugin](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Virtualized%20Environments/Containers-and-VMs-cgroups) for container and pod monitoring. To properly label k8s containers, the `cgroup` plugin makes calls to the k8s API server to retrieve pod metadata. In large clusters and under certain conditions (e.g. starting all the agents at once), these requests can potentially cause serious stress on the API server, or even a denial of service incident. To address this issue we have provided an alternative to querying the API server. We [now allow querying the local kubelet server](https://github.com/netdata/netdata/pull/14891) for the same information. 
However, since the Kubelet's `/pods` endpoint is not well documented and should probably not be relied on (see [1](https://github.com/fluent/fluent-bit/issues/1948#issuecomment-680291735), [2](https://github.com/fluent/fluent-bit/issues/1948#issuecomment-691219053)), we still query the API server by default. To switch to querying Kubelet, you can set the `child.podsMetadata.useKubelet` and `child.podsMetadata.kubeletUrl` variables that [were added to our Helm chart](https://github.com/netdata/helmchart/pull/349). ### Kernel traces/metrics (eBPF) <a id="v1390-kernel-traces-metrics-ebpf"></a> The [eBPF Collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/system-metrics/kernel-traces-metrics-ebpf) offers numerous eBPF programs to assist you in troubleshooting and analyzing how applications interact with the Linux kernel. By utilizing [tracepoints, trampoline, and kprobes](#how-netdata-collects-data-using-probes-and-tracepoints), we gather a wide range of valuable data about the host that would otherwise be unattainable. We recently addressed some significant issues with SIGABRT crashes on some systems. These crashes were caused by problems with memory allocation and deallocation functions, which resulted in unstable system behavior and prevented users from effectively monitoring their systems. To resolve these issues, we [made some significant changes](https://github.com/netdata/netdata/pull/14591) to our memory allocation and deallocation functions. Specifically, we replaced these functions with more reliable alternatives and began using vector allocation where possible. We later identified issues with memory corruption, `Oracle Linux` ported codes and `OOMKill`, which were all resolved with [#14869](https://github.com/netdata/netdata/pull/14869). 
Finally, issues with CPU usage on EC2 instances appeared in a nightly release and were resolved with [some changes](https://github.com/netdata/netdata/pull/14902) that speed up the plugin clean up process and also prevent some possible `SIGABRT` and `SIGSEGV` crashes. These changes helped to reduce the likelihood of crashes occurring and improved the overall stability and reliability of the eBPF collector. In some environments, the collector demanded substantial memory resources. To address this, we introduced [charts to monitor its memory usage](https://github.com/netdata/netdata/pull/14623) and implemented initial optimizations to [decrease the RAM requirements](https://github.com/netdata/netdata/pull/14462). We will continue this work in future releases, to bring you even more eBPF observability superpowers, with minimal resource needs. ### Disk Space Monitoring <a id="v1390-disk-space-monitoring"></a> The [disk space plugin](https://learn.netdata.cloud/docs/data-collection/monitor-anything/system-metrics/disks) is designed to monitor disk space usage and inode usage for mounted disks in Linux. However, because `msdos`/`FAT` file systems don't use inodes, the plugin would often generate false positives, leading to inaccurate results. To fix this, we've [disabled inode data collection for these file systems](https://github.com/netdata/netdata/pull/14809), using the `exclude inode metrics on filesystems` configuration option. This option has a default value of `msdosfs msdos vfat overlayfs aufs* *unionfs`. ### OS Provided Metrics (proc.plugin) <a id="v1390-os-provided-metrics-proc-plugin"></a> Our [proc plugin](https://learn.netdata.cloud/docs/data-collection/monitor-anything/system-metrics/os-provided-metrics-proc.plugin) is responsible for gathering system metrics from various endpoints, including `/proc` and `/sys` folders in Linux systems. It is an essential part of our monitoring tool, providing insights into system performance. 
When running the Netdata agent in a Docker container, we encountered an issue where `zram` memory metrics were not being displayed. To solve this, [we made changes](https://github.com/netdata/netdata/pull/14759) to the `zram` collector code, respecting the `/host` prefix added to the directories mounted from the host to the container. Now, our monitoring tool can collect `zram` memory metrics even when running in a Docker container. We also improved the `zfs` storage pool monitoring code, by [adding](https://github.com/netdata/netdata/pull/14934) the state `suspended` to the list of monitored states. Finally, [we added](https://github.com/netdata/netdata/pull/14636) new metrics for BTRFS commits and device errors. ### PostgreSQL <a id="v1390-postgresql"></a> Our [PostgreSQL collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Databases/PostgresSQL) is a highly advanced application collector, offering 70 out-of-the-box charts and 14 alerts to help users monitor their PostgreSQL databases with ease. We recently discovered an issue in our documentation where we were instructing users to create a `netdata` user, even though our data collection job was using the `postgres` user. To address this issue, we [have now added](https://github.com/netdata/go.d.plugin/pull/1104) the `netdata` user as an additional option to our data collection jobs. With this enhancement, users can now use either the `postgres` user or the newly added `netdata` user to collect data from their PostgreSQL databases, ensuring a more seamless and accurate monitoring experience. Netdata automatically generates several charts for PostgreSQL write-ahead logs (WAL). We recently discovered that `wal_files_count`, `wal_archiving_files_count` and `replication_slot_files_count` require superuser access, so we [added a check](https://github.com/netdata/go.d.plugin/pull/1122) on whether the collection job has superuser access, before attempting to collect these WAL metrics.
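A privilege gate of this kind can be sketched as follows. This is an illustration in Python, not the collector's actual Go code. It assumes a hypothetical `run_query` callable that executes one SQL statement and returns a single string value; `current_setting('is_superuser')` is a standard PostgreSQL setting that reports `on` for superuser sessions. `BASE_METRICS` and the helper names are made up for the example.

```python
from typing import Callable, List

# Metrics the release notes name as requiring superuser access.
SUPERUSER_ONLY = [
    "wal_files_count",
    "wal_archiving_files_count",
    "replication_slot_files_count",
]

# Hypothetical metrics that any monitoring user may read.
BASE_METRICS = ["connections", "db_size"]


def is_superuser(run_query: Callable[[str], str]) -> bool:
    """Ask the server whether the current session has superuser rights."""
    return run_query("SELECT current_setting('is_superuser')") == "on"


def metrics_to_collect(run_query: Callable[[str], str]) -> List[str]:
    """Return the metric list, adding WAL file metrics only for superusers."""
    metrics = list(BASE_METRICS)
    if is_superuser(run_query):
        metrics += SUPERUSER_ONLY
    return metrics


# Stand-ins for a real database connection (assumptions for the demo).
def fake_superuser(sql: str) -> str:
    return "on"


def fake_regular(sql: str) -> str:
    return "off"


regular_metrics = metrics_to_collect(fake_regular)
superuser_metrics = metrics_to_collect(fake_superuser)
```

Checking once up front avoids issuing queries that would fail with a permission error on every collection cycle.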
Finally, [we fixed a bug with the bloat size calculation](https://github.com/netdata/go.d.plugin/pull/1094) that used to erroneously return zeroes for some indexes. ### DNS Query <a id="v1390-dns-query"></a> The [DNS query collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Dns/DNS-queries) is a crucial tool that ensures optimal system performance by monitoring the liveness and latency of DNS queries. This tool is simple yet essential, as it attempts to resolve any hostname you provide and creates metrics for the response time and success or failure of each request/response. Previously, we only measured the response time for successful queries. However, we have now enhanced the DNS query collector by [collecting latency data for failed queries as well](https://github.com/netdata/go.d.plugin/pull/1112). This improvement enables us to identify and troubleshoot DNS errors more effectively, which ultimately leads to improved system reliability and performance. ### HTTP endpoint check <a id="v1390-http-endpoint"></a> Modern endpoint monitoring should include periodic checks on all your internal and public web applications, regardless of their traffic patterns. Automated and continuous tests can proactively identify issues, allowing them to be resolved before any users are affected. Netdata's [HTTP endpoint collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/networking/http-endpoints/) is a powerful tool that enables users to monitor the response status, latency, and content of any URL provided. While the collector has always supported basic authentication via a provided username and password, we have recently introduced a new enhancement that allows for more complex authentication flows. 
With the addition of the [ability to include a cookie in the request](https://github.com/netdata/go.d.plugin/pull/1133), users can now authenticate and monitor more advanced applications, ensuring more comprehensive and accurate monitoring capabilities. All you need to do is to add `cookie: <filename>` to your data collection job and the collector will issue the request with the contents of that file. ### Elasticsearch and OpenSearch <a id="v1390-elasticsearch-and-opensearch"></a> Our [Elasticsearch Collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/search/elasticsearch) seamlessly generates visualizations for 47 metrics, drawing from 4 endpoints of the renowned search engine. The original Elasticsearch project evolved into an open-source initiative called [OpenSearch](https://opensearch.org/), spearheaded by Amazon. However, our collector [did not automatically connect to OpenSearch instances](https://community.netdata.cloud/t/opensearch-elasticsearch-alternative-does-not-appear-automatically/3995) due to their default security settings with TLS and authentication. Although it is possible to disable security by adding `plugins.security.disabled: true` to `/etc/opensearch/opensearch.yml`, which allows the default data collection job to function, we deemed it more prudent to [introduce an OpenSearch-specific data collection job](https://github.com/netdata/go.d.plugin/pull/1140). This addition explicitly enables TLS and highlights the necessity of a username and password for secure access. ### Dnsmasq DNS Forwarder <a id="v1390-dnsmasq-dns-forwarder"></a> [`Dnsmasq`](http://www.thekelleys.org.uk/dnsmasq/doc.html) is a lightweight and easy-to-configure DNS forwarder that is specifically designed to offer DNS, DHCP, and TFTP services to small-scale networks.
Netdata provides comprehensive monitoring of `Dnsmasq` by collecting metrics for both the [DHCP server](https://learn.netdata.cloud/docs/data-collection/monitor-anything/dns/dnsmasq-dhcp) and [DNS forwarder](https://learn.netdata.cloud/docs/data-collection/monitor-anything/dns/dnsmasq-dns-forwarder). Recently, we made a minor but important improvement to the order in which the DNS forwarder cache charts are displayed. With this update, the most critical information regarding cache utilization [is now presented first](https://github.com/netdata/go.d.plugin/pull/1125), providing users with more efficient access to essential data. By constantly improving and refining our monitoring capabilities, we aim to provide our users with the most accurate and useful insights into their network performance. ### Envoy <a id="v1390-envoy"></a> [Envoy](https://www.envoyproxy.io/docs/envoy/latest/intro/what_is_envoy) is an L7 proxy and communication bus designed for large modern service oriented architectures. Our [new](https://github.com/netdata/go.d.plugin/pull/1142) [Envoy collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Proxies/Envoy) automatically generates charts for over 50 metrics. ### Files and directories <a id="v1390-files-and-directories"></a> The [files and directories collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/system-metrics/files-and-dirs) monitors existence, last update and size of any files or directories you specify. The collector was not sanitizing file and directory names, causing issues with metric collection. The issue was specific to paths with spaces in them and is [now fixed](https://github.com/netdata/go.d.plugin/pull/1158). ### RabbitMQ <a id="v1390-rabbitmq"></a> The Netdata agent includes a [RabbitMQ collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/rabbitmq) that tracks the performance of this open-source message broker. 
This collector queries RabbitMQ's HTTP endpoints, including `overview`, `node`, and `vhosts`, to provide you with detailed metrics on your RabbitMQ instance. Recently, [we fixed an issue](https://github.com/netdata/go.d.plugin/pull/1114) that prevented our collector from properly collecting metrics on 32-bit systems. ### charts.d.plugin <a id="v1390-charts-d-plugin"></a> The [charts.d plugin](https://learn.netdata.cloud/docs/improving-netdata---developers/external-plugins/charts.d.plugin/) is an external plugin for Netdata. It's responsible for orchestrating data collection modules written in `BASH` v4+ to gather and visualize metrics. Recently, we fixed an issue with the plugin's restarts that sometimes caused the connection to Netdata to be lost. Specifically, there was a chance for charts.d processes to die at the exact same time when the Netdata binary tried to read from them using `fgets`. This caused Netdata to hang, as `fgets` never returned. To fix this issue, [we added](https://github.com/netdata/netdata/pull/14680) a "last will" `EOF` to the exit process of the plugin. This ensures that the fgets call has something to receive before the plugin exits, preventing Netdata from hanging. With this issue resolved, the charts.d plugin can now continue to provide seamless data collection and visualization for your Netdata instance without any disruptions. ### Anomalies <a id="v1390-anomalies"></a> Our [anomaly collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/any-application---generic-collectors/anomalies) is a powerful tool that uses the `PyOD` library in Python to perform unsupervised anomaly detection on your Netdata metrics. With this collector, you can easily identify unusual patterns in your data that might indicate issues with your system or applications. Recently, we discovered an issue with the collector's Python version check. 
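To illustrate how a Python version check can misfire on two-digit minor versions, here is a minimal sketch. It is not the collector's original code, and the minimum version used is hypothetical. Parsing a `major.minor` string as a float turns `"3.10"` into `3.1`, which then compares as older than `3.6`; comparing integer tuples avoids the problem.

```python
def broken_check(version: str, minimum: float = 3.6) -> bool:
    """Buggy approach: float("3.10") is 3.1, so 3.10 looks older than 3.6."""
    return float(version) >= minimum


def fixed_check(version: str, minimum=(3, 6)) -> bool:
    """Compare (major, minor) as integers, so 10 is greater than 6."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


broken_310 = broken_check("3.10")  # False: wrongly rejected
fixed_310 = fixed_check("3.10")    # True: correctly accepted
```

For the running interpreter, `sys.version_info >= (3, 6)` performs the same tuple comparison without any string parsing.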
Specifically, the check was incorrectly rejecting Python 3.10 and higher versions due to how the `float()` function was casting "10" to "1". This resulted in an inaccurate check that prevented some users from using the anomaly collector with the latest versions of Python. To resolve this issue, [we fixed the Python version check](https://github.com/netdata/netdata/pull/14616) to work properly with Python 3.10 and above. With this fix in place, all users can now take advantage of the anomaly collector's powerful anomaly detection capabilities regardless of the version of Python they are using. ### Generic structured data (Pandas) <a id="v1390-generic-structured-data-pandas"></a> [Pandas](https://pandas.pydata.org/) is a de-facto standard in reading and processing most types of structured data in Python. If you have metrics appearing in a CSV, JSON, XML, HTML, or [other supported format](https://pandas.pydata.org/docs/user_guide/io.html), either locally or via some HTTP endpoint, you can easily ingest and present those metrics in Netdata, by leveraging the [Pandas collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/any-application---generic-collectors/structured-data-pandas). We [fixed an issue](https://github.com/netdata/netdata/pull/14736) we had logging some collector errors. ### Generic Prometheus collector <a id="v1390-generic-prometheus-collector"></a> Our [Generic Prometheus Collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus) gathers metrics from any [Prometheus](https://prometheus.io/) endpoint that uses the [OpenMetrics exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/). In version 1.38, we made some significant changes to how we generate charts with labels per label set. 
[These changes](https://github.com/netdata/go.d.plugin/pull/1004) resulted in a drastic increase in the length of generated chart IDs, which posed some challenges for users with a large number of label key/value pairs. In some cases, the length of the `type.id` string could easily exceed the previous limit of 200 characters, which prevented users from effectively monitoring their systems. To resolve this issue, we took action to [increase the chart ID limit](https://github.com/netdata/go.d.plugin/pull/1126) from 200 to 1000 characters. This change provides you with more flexibility when it comes to labeling your charts and ensures that you can effectively monitor your systems regardless of the number of label key/value pairs you use. ## Alerts and Notifications <a id="v1390-alerts-and-notifications"></a> ### Notifications <a id="v1390-notifications"></a> #### Improved email alert notifications <a id="v1390-improved-email-alert-notifications"></a> We recently made some significant improvements to our email notification templates. These changes include adding the chart context, Space name, and War Room(s) with navigation links. We also updated the way the subject is built to ensure it's consistent with our other templates. These improvements help to provide users with more context around their alert notifications, making it easier to quickly understand the nature of the issue and take appropriate action. By including chart context, Space name, and War Room(s) information, users can more easily identify the source of the problem and coordinate a response with their team members. <img src="https://user-images.githubusercontent.com/43294513/235459277-bf94f66c-4856-4a64-8c66-f2aa087317e6.png" width=500/> #### Receive only notifications for unreachable nodes <a id="v1390-receive-only-notifications-for-unreachable-nodes"></a> We've also enhanced our personal notification level settings to include an "Unreachable only" option.
This option allows you to receive only reachability notifications for nodes disconnected from Netdata Cloud. Previously, this capability was only available combined with "All alerts". With this enhancement, you can now further customize your notification settings to more effectively manage your alerts and reduce notification fatigue. <img src="https://user-images.githubusercontent.com/43294513/234083086-87c92f93-0c5d-49a2-bfe0-e1c27faa02f1.png" width=500/> #### ntfy agent alert notifications <a id="v1390-ntfy-agent-alert-notifications"></a> The Netdata agent can now send alerts to [ntfy](https://ntfy.sh/) servers. `ntfy` (pronounced "notify") is a simple HTTP-based [pub-sub](https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern) notification service. It allows you to send notifications to your phone or desktop via scripts from any computer, entirely without sign-up, cost or setup. It's also [open source](https://github.com/binwiederhier/ntfy) if you want to run your own server. You can learn how to send `ntfy` alert notifications from a Netdata agent in [our documentation](https://learn.netdata.cloud/docs/alerts-and-notifications/notifications/agent-alert-notifications/ntfy). ### Enhanced Real-Time Alert Synchronization on Netdata Cloud <a id="v1390-enhanced-real-time-alert-synchronization-on-netdata-cloud"></a> Netdata Cloud manages millions of alert state transitions daily. These transitions are transmitted from each connected agent through the Agent-Cloud Link ([ACLK](https://learn.netdata.cloud/docs/deployment-in-production/secure-your-nodes/agent-cloud-link-aclk)). As with any communication channel, occasional data loss is unavoidable. Therefore, swiftly detecting missing transitions and reconciling discrepancies is crucial for maintaining real-time observability, regardless of scale. We are thrilled to introduce a significant enhancement to our alert synchronization protocol between Netdata Agents and Netdata Cloud.
This upgrade ensures faster transmission of alert states and prompt resolution of any temporary inconsistencies. <details> <summary>Click to read more </summary> In the past, whenever a state transition change occurred, a message with a sequencing number was sent from the Agent to the Cloud. This method resulted in numerous read/write operations, generating excessive load on our Alerts database in the Cloud. Furthermore, it assumed that all messages had to be processed sequentially, imposing unnecessary constraints and restricting our scaling options for message brokers. ![image](https://user-images.githubusercontent.com/82235632/235124081-ad49d130-22da-4073-9998-24fa54b887ae.png) Our revamped protocol implements a far more efficient method. Instead of relying on sequencing numbers, we now use a checksum value calculated by both the Cloud and the Agent to verify synchronization. This approach not only lessens the burden on our Alerts database but also eliminates the dependency on sequential message processing, permitting out-of-order message delivery. ![image](https://user-images.githubusercontent.com/82235632/235124719-3cf87bae-3ede-4567-8785-b2f0e0044668.png) The enhanced synchronization and scaling capabilities allow us to address certain edge cases where users experienced out-of-sync alerts on the Cloud. Consequently, we can now deliver a superior service to our users. </details> ## Visualizations / Charts and Dashboards <a id="v1390-visualizations-charts-and-dashboards"></a> ### Events Feed <a id="v1390-events-feed"></a> We're committed to continually improving our [Events Feed](https://learn.netdata.cloud/docs/troubleshooting-and-machine-learning/events-feed), which we introduced in version 1.38. We've made several user experience (UX) improvements to make the Events Feed even more useful for troubleshooting purposes. One of the key improvements we made was the addition of a bar chart showing the distribution of events over time.
This chart helps users quickly identify interesting time periods to focus on during troubleshooting. By visualizing the distribution of events across time, users can more easily spot patterns or trends that may be relevant to their troubleshooting efforts. These improvements help to make the Events Feed an even more valuable tool, helping you troubleshoot issues more quickly and effectively. We will continue to explore ways to enhance the Events Feed and other features of our monitoring tool to provide the best possible user experience. ![image](https://user-images.githubusercontent.com/43294513/234084876-9df2451d-c6a9-49bc-bec2-cf882523481e.png) ## Machine Learning <a id="v1390-machine-learning"></a> As part of our [Machine Learning Roadmap](https://github.com/orgs/netdata/projects/54) we have been working to [persist trained models to the db](https://github.com/netdata/netdata/issues/14217) so that the models used in Netdata's [native anomaly detection capabilities](https://learn.netdata.cloud/docs/troubleshooting-and-machine-learning/machine-learning-ml-powered-anomaly-detection) will not be lost on restarts and instead be persisted to the database. This is an important step on the way to [extending the ML defaults](https://github.com/netdata/netdata/pull/14222) to train on the last 24 hours by default in the near future (as discussed more in [this blog post](https://blog.netdata.cloud/extending-anomaly-detection-training-window/)). This will help improve anomaly detection performance, reducing false positives and making anomaly rates more robust to system and netdata restarts where previously models would need to be fully re-trained. This is an area of quite active development right now and there are still a few more pieces of work to be done in coming releases. 
If interested, you can follow along with any `area/ML` [issues in `netdata/netdata-cloud`](https://github.com/netdata/netdata-cloud/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fml) or [`netdata/netdata`](https://github.com/netdata/netdata/issues?q=is%3Aissue+is%3Aopen+label%3Aarea%2Fml) and check out active PRs [here](https://github.com/netdata/netdata/pulls?q=is%3Apr+is%3Aopen+label%3Aarea%2Fml). ## Installation and Packaging <a id="v1390-installation-and-packaging"></a> ### Improved Linux compatibility <a id="v1390-improved-linux-compatibilityr"></a> We have updated the bundled version of makeself used to create static builds, which was almost six years out of date, to sync it with the latest upstream release. This update should significantly improve compatibility on more exotic Linux systems. We have also updated the metadata embedded in the archive to better reflect the current state of the project. This ensures that the project is up to date and accurately represented, providing users with the most relevant and useful information. You can find more details about these updates in [our GitHub repository](https://github.com/netdata/netdata/pull/14822). ## Administration <a id="v1390-administration"></a> ### New way to retrieve netdata.conf <a id="v1390-new-way-to-retrieve-netdata-conf"></a> Previously, the only way to get a default `netdata.conf` file was to start the agent and query the `/netdata.conf` API endpoint. This worked well enough for checking the effective configuration of a running agent, but it also meant that `edit-config netdata.conf` didn't work as users expected when there was no `netdata.conf` file. It also meant that you couldn't check the effective configuration if you had the web server disabled. We [have now added](https://github.com/netdata/netdata/pull/14906) the `netdatacli dumpconfig` command, which outputs the current `netdata.conf`, exactly like the web API endpoint does.
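As a hedged sketch of how a script could save that output to a file: the helper below runs a command and writes its standard output to a path. Only `netdatacli dumpconfig` itself comes from these release notes; the function name, the injectable `command` parameter, and any target path are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def dump_config(target: Path, command=("netdatacli", "dumpconfig")) -> int:
    """Run `command`, write its stdout to `target`, return the bytes written.

    `command` is injectable so the helper can be exercised without a
    running agent; by default it calls `netdatacli dumpconfig`.
    """
    # check=True raises CalledProcessError if the command exits non-zero.
    result = subprocess.run(list(command), capture_output=True, check=True)
    target.write_bytes(result.stdout)
    return len(result.stdout)
```

On a host with a running agent and `netdatacli` on the `PATH`, calling `dump_config(Path("netdata.conf"))` would write the effective configuration to a hypothetical `netdata.conf` in the current directory.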
In the future we will look into making the `edit-config` command a bit smarter, so that it can provide the option to automatically retrieve the live `netdata.conf`. ## Documentation and Demos <a id="v1390-documentation-and-demos"></a> We're excited to announce the completion of a radical overhaul of our documentation site, available at [learn.netdata.cloud](https://learn.netdata.cloud). Our new site features a much clearer organization of content, a streamlined publishing process, and a powerful Google search bar that searches all available resources for articles matching your queries. We've restructured and improved dozens of articles, updating or eliminating obsolete content and deduplicating similar or identical content. These changes help to ensure that our documentation remains up-to-date and easy to navigate. Even seasoned Netdata power users should take a look at our new [Deployment in Production](https://learn.netdata.cloud/docs/deployment-in-production/) section, which includes features and suggestions that you may have missed in the past. We're committed to maintaining the highest standards for our documentation and invite our users to assist us in this effort. The "Edit this page" button, available on all published articles, allows you to suggest updates or improvements by directly editing the source file. We hope that our new documentation site helps you more effectively use and understand our monitoring tool, and we'll continue to make improvements and updates based on your feedback. ## Deprecation notice <a id="v1390-deprecation-notice"></a> The following items will be removed in our next minor release (v1.40.0): > Patch releases (if any) will not be affected. 
| Component | Type | Will be replaced by | |--------------------------------------------------------------------------------------------------------------|:---------:|:----------------------------------------------------------------------------------------:| | [python.d/nvidia_smi](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/nvidia_smi) | collector | [go.d/nvidia_smi](https://github.com/netdata/go.d.plugin/tree/master/modules/nvidia_smi) | ### Deprecated in this release <a id="v1390-deprecated-in-this-releaase"></a> In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.38.0#v1380-deprecation), the following items [have been removed](https://github.com/netdata/netdata/pull/14454) in this release: | Component | Type | Replaced by | |--------------------------------------------------------------------------------------------------------------|:---------:|:----------------------------------------------------------------------------------------:| | [python.d/ntpd](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/ntpd) | collector | [go.d/ntpd](https://github.com/netdata/go.d.plugin/tree/master/modules/ntpd) | | [python.d/proxysql](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/proxysql) | collector | [go.d/proxysql](https://github.com/netdata/go.d.plugin/tree/master/modules/proxysql) | | [python.d/rabbitmq](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/rabbitmq) | collector | [go.d/rabbitmq](https://github.com/netdata/go.d.plugin/tree/master/modules/rabbitmq) | ## Netdata Release Meetup <a id="v1390-netdata-release-meetup"></a> Join the Netdata team on the **9th of May, at 16:00 UTC** for the [Netdata Release Meetup](https://discord.gg/YzjpEZtE?event=1102951879954137148). Together we’ll cover: - Release Highlights. - Acknowledgements. - Q&A with the community. 
[RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/293274898/) - we look forward to meeting you. ## Support options <a id="v1390-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it! ## Running survey <a id="v1390-running-survey"></a> Help us make Netdata even greater! We are trying to gather valuable information that is key for us to better position Netdata and ensure we keep bringing more value to you. We would appreciate it if you could take some time to answer [this short survey (4 questions only)](https://forms.gle/oCJo4WDJfqfvBZi17). ## Acknowledgements <a id="v1390-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product.
- [@farax4de](https://github.com/farax4de) for [fixing an exception](https://github.com/netdata/netdata/pull/14844) in the [CEPH collector](https://learn.netdata.cloud/docs/data-collection/monitor-anything/Storage/CEPH). - [@ghanapunq](https://github.com/ghanapunq) for [adding ethtool](https://github.com/netdata/netdata/pull/14753) to [the list of third-party collectors](https://learn.netdata.cloud/docs/data-collection/monitor-anything/#third-party-collectors). - [@intelfx](https://github.com/intelfx) for [fixing the accounting of Btrfs unallocated space](https://github.com/netdata/netdata/pull/14824) in the [proc plugin](https://learn.netdata.cloud/docs/data-collection/monitor-anything/System%20Metrics/OS-provided-metrics-proc.plugin). - [@k0ste](https://github.com/k0ste) for [fixing an issue with x86_64 RPM builds with eBPF](https://github.com/netdata/netdata/pull/14552). - [@slavox](https://github.com/slavox) for [fixing a typo in the documentation](https://github.com/netdata/netdata/pull/14854). - [@vobruba-martin](https://github.com/vobruba-martin) for [fixing](https://github.com/netdata/netdata/pull/14424) the `--release-channel` and `--nightly-channel` options in `kickstart.sh`. - [@bompus](https://github.com/bompus) for [fixing an issue preventing non-interactive installs of our static builds from being fully automatic](https://github.com/netdata/netdata/pull/14950). - [@D34DC3N73R](https://github.com/D34DC3N73R) for [adding Docker instructions to enable NVIDIA GPUs](https://github.com/netdata/netdata/pull/14924) 2023-05-08T14:49:58+00:00 netdata v1.39.1 netdata v1.39.1 2023-05-18T14:21:04+00:00 This patch release provides the following bug fixes: - We noticed that claiming and enabling auto-updates have been failing due to incorrect permissions when `kickstart.sh` was doing a static installation. The issue has affected all static installations, including the one done from the Windows MSI installer.
The permissions [have now been corrected](https://github.com/netdata/netdata/pull/15042). - The recipient lists of agent alert notifications are configurable via the `health_alarm_notify.conf` file. A stock file with default configurations can be modified using `edit-config`. [@jamgregory](https://github.com/jamgregory) noticed that the default settings in that file can make changing role recipients confusing. Unless the edited configuration file included every setting of the original stock file, the resulting behavior was unintuitive. [@jamgregory](https://github.com/jamgregory) kindly added a PR to [fix the handling of custom role recipient configurations](https://github.com/netdata/netdata/pull/15047). - A bug in our collection and reporting of Infiniband bandwidth was discovered and [fixed](https://github.com/netdata/netdata/pull/14748). - We noticed memory buffer overflows under some very specific conditions. We [adjusted](https://github.com/netdata/netdata/pull/15025) the relevant buffers and the calls to `strncpyz` to prevent such overflows. - A memory leak in certain circumstances was found in the ACLK code. We [fixed the incorrect data handling that caused it](https://github.com/netdata/netdata/pull/15055). - An unrelated memory leak was discovered in the ACLK code and [has also been fixed](https://github.com/netdata/netdata/pull/15060). - Exposing the anomaly rate right on top of each chart in Netdata Cloud surfaced an issue of [bad ML models on some very noisy metrics](https://github.com/netdata/netdata/discussions/14993). We addressed the issue by [suppressing the indications](https://github.com/netdata/netdata/pull/15011) that these noisy metrics would produce. This change gives the ML model a chance to improve, based on additional collected data. - Finally, we [improved the handling of errors during ML transactions](https://github.com/netdata/netdata/pull/15013), so that transactions are properly rolled back, instead of failing in the middle.
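The rollback behavior described in the last item can be sketched with SQLite from Python's standard library. This is an illustration only, not Netdata's code (the agent is written in C); the `ml_models` table and the `save_models` helper are hypothetical.

```python
import sqlite3


def save_models(conn: sqlite3.Connection, models) -> bool:
    """Store (metric, payload) pairs atomically: all rows or none."""
    try:
        # As a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.executemany(
                "INSERT INTO ml_models (metric, payload) VALUES (?, ?)",
                models,
            )
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ml_models (metric TEXT PRIMARY KEY, payload BLOB NOT NULL)"
)
conn.commit()

# First batch succeeds and is committed.
ok = save_models(conn, [("system.cpu", b"m1"), ("system.ram", b"m2")])

# Second batch fails on the duplicate key; the "disk.io" row inserted
# earlier in the same transaction is rolled back with it.
bad = save_models(conn, [("disk.io", b"m3"), ("system.cpu", b"dup")])

rows = conn.execute("SELECT COUNT(*) FROM ml_models").fetchone()[0]
```

Without the rollback, the partially written second batch would leave the table in an inconsistent state.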
## Support options <a id="v1391-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs and other troubleshooters. More than 1300 engineers are already using it! 2023-05-18T14:21:04+00:00 netdata v1.40.0 netdata v1.40.0 2023-06-14T17:13:28+00:00 - [Netdata Growth](#v1400-netdata-open-source-growth) - [Release Highlights](#v1400-release-highlights) - **[Dashboard Sections' Summary Tiles](#v1400-visualization-summary-dashboards)**<br/> Added summary tiles to most sections of the fully-automated dashboards, to provide an instant view of the most important metrics for each section. - **[Silencing of Cloud Alert Notifications](#v1400-alert-notification-silencing)**<br/> Maintenance window coming up? Active issue being checked? Use the Alert notification silencing engine to mute your notifications. - **[Machine Learning - Extended Training to 24 Hours](#v1400-ml-extended-training)**<br/> Netdata now trains multiple models per metric, to learn the behavior of each metric for the last 24 hours.
Trained models are persisted on disk and are loaded back on Netdata restart. - **[Rewritten SSL Support for the Agent](#v1400-streaming)**<br/> Netdata Agent now features a new SSL layer that allows it to reliably use SSL on all its features, including the API and Streaming. - [Alerts and Notifications](#v1400-alerts) - [Visualizations / Charts and Dashboards](#v1400-visualizations) - [Preliminary steps to split native packages](#v1400-packaging-split) - [Acknowledgements](#v1400-acknowledgements) - [Contributions](#v1400-contributions) - [Collectors](#v1400-contributions-collectors) - [Documentation](#v1400-contributions-documentation) - [Packaging / Installation](#v1400-contributions-packaging) - [Streaming](#v1400-contributions-streaming) - [Health](#v1400-contributions-health) - [Exporting](#v1400-contributions-exporting) - [ML](#v1400-contributions-ml) - [Other notable changes](#v1400-contributions-other) - [Deprecation notice](#v1400-deprecation-notice) - [Cloud recommended version](#v1400-cloud-recommended-version) - [Release meetup](#v1400-release-meetup) - [Support options](#v1400-support-options) - [Running survey](#v1400-running-survey) ## Netdata Growth <a id="v1400-netdata-open-source-growth"></a> 🚀 Our community growth is increasing steadily. ❤️ Thank you! Your love and acceptance give us the energy and passion to work harder to simplify and make monitoring easier, more effective and more fun to use. <!-- Retrieve most of these stats from netdata/netdata/README.md badges --> - Over 63,000 GitHub Stars ⭐ - Over 1.5 million online nodes - Almost 94 million sessions served - Over 600 thousand total nodes in Netdata Cloud<br/> **Wow! Netdata Cloud is about to become the biggest and most scalable monitoring infra ever created!** > _Let the world know you love Netdata. > **[Give Netdata a ⭐ on GitHub](https://github.com/netdata/netdata) now.** > Motivate us to keep pushing forward!_ ### Unlimited Docker Hub Pulls! 
To help our community use Netdata more broadly, we just signed an agreement with Docker for the purchase of Rate Limit Removal, which will remove all Docker Hub pull limits for the Netdata repos at Docker Hub. We expect this add-on to be applied to our repos in the following few days, so that you will enjoy **unlimited Docker Hub pulls of Netdata Docker images for free**! ## Release Highlights <a id="v1400-release-highlights"></a> ### Dashboard Sections' Summary Tiles <a id="v1400-visualization-summary-dashboards"></a> Netdata Cloud dashboards have been improved to provide instant summary tiles for most of their sections. This includes system overview, disks, network interfaces, memory, mysql, postgresql, nginx, apache, and dozens more. To accomplish this, we extended the query engine of Netdata to support multiple grouping passes, so that queries like "sum metrics by label X, and then average by node" are now possible. At the same time we made room for presenting anomaly rates on them (vertical purple bar on the right) and significantly improved the tile placement algorithm to support multi-line summary headers and precise sizing and positioning, providing a look and feel like this: ![image](https://github.com/netdata/learn/assets/70198089/a9c54bf4-c3db-40b0-9fc3-d043ae911589) The following chart tile types have been added: - Donut <img src="https://github.com/netdata/learn/assets/70198089/80b015ee-ce1d-4fd7-85d6-e7c9026f3277" height="250"></img> - Gauge <img src="https://github.com/netdata/learn/assets/70198089/494c946c-71ef-465a-a975-4d4e4a75ff01" height="250"></img> - Bar <img src="https://github.com/netdata/learn/assets/70198089/e5a7b490-a237-4462-98cd-a8b83d9aa0d7" height="250"></img> - Trendline <img src="https://github.com/netdata/learn/assets/70198089/fe1ba270-83c2-4c6a-880a-ef10d13647ce" height="250"></img> - Number <img src="https://github.com/netdata/learn/assets/70198089/68663ea8-c358-40ee-aef7-29b6c21c30b9" height="250"></img> - Pie chart <img 
src="https://github.com/netdata/learn/assets/70198089/9d37c83f-80a7-4ccd-ab4b-a13356d32951" height="250"></img> To make these tiles more useful, each one supports the following interactive actions: 1. Clicking the title of the tile scrolls the dashboard to the data source chart, where you can slice, dice and filter the data from which the tile was created. 2. Hovering over the tile with your mouse pointer reveals the NIDL (Nodes, Instances, Dimensions, Labels) framework buttons, allowing you to explore and filter the data set, right on the tile. Some examples that you can see from the Netdata Demo space: * [CPU](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_cpu) * [Containers & VMs](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_cgroup) * [K8s containers](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_Kubernetes_Containers) * [K8s state](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_Kubernetes_State) * [NGINX
Plus](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_nginxplus) * [PostgreSQL](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_postgres) * [Windows](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/overview#metrics_correlation=false&after=-900&before=0&utc=Europe%2FLisbon&offset=%2B1&timezoneName=Dublin%2C%20Lisbon&modal=&modalTab=&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a--chartName=menu_windows) ### Silencing of Cloud Alert Notifications <a id="v1400-alert-notification-silencing"></a> Although Netdata Agent alerts support silencing, centrally dispatched alert notifications from Netdata Cloud were missing that feature. Today, we release alert notification silencing rules for Netdata Cloud! Silencing rules are applied on any combination of the following: users, rooms, nodes, host labels, contexts (charts), alert name, alert role. For the matching alerts, silencing can optionally have a starting date and time and/or an ending date and time. With this feature you can now easily set up silencing rules that are applied immediately or on a defined schedule, allowing you to plan for upcoming scheduled maintenance windows - see some examples [here](https://learn.netdata.cloud/docs/alerts-and-notifications/notifications/netdata-cloud-notifications/manage-alert-notification-silencing-rules#silencing-rules-examples). 
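
As a rough illustration of how such a rule behaves, here is a minimal Python sketch. It is not Netdata Cloud's implementation: the `SilencingRule` class, its field names, and the flat alert dictionary are all assumptions made for this example, and host labels (which are key-value pairs in practice) are simplified to plain attributes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SilencingRule:
    """Hypothetical model of a silencing rule (names are illustrative only)."""

    # Maps an alert attribute (e.g. "room", "node", "alert_name") to the set
    # of accepted values. An attribute that is not listed matches anything.
    criteria: dict = field(default_factory=dict)
    starts_at: Optional[datetime] = None  # None: the rule applies immediately
    ends_at: Optional[datetime] = None    # None: the rule has no end

    def silences(self, alert: dict, now: datetime) -> bool:
        """Return True if a notification for `alert` should be muted at `now`."""
        if self.starts_at is not None and now < self.starts_at:
            return False  # scheduled window has not started yet
        if self.ends_at is not None and now >= self.ends_at:
            return False  # scheduled window is over
        # Every listed criterion must match; unlisted attributes are ignored.
        return all(alert.get(key) in accepted
                   for key, accepted in self.criteria.items())


# A maintenance window for one alert in one room (made-up values).
rule = SilencingRule(
    criteria={"room": {"production"}, "alert_name": {"disk_space_usage"}},
    starts_at=datetime(2023, 6, 20, 22, 0, tzinfo=timezone.utc),
    ends_at=datetime(2023, 6, 21, 2, 0, tzinfo=timezone.utc),
)
alert = {"room": "production", "node": "db-01",
         "alert_name": "disk_space_usage", "role": "sysadmin"}
```

Leaving both `starts_at` and `ends_at` unset in this sketch corresponds to a rule that is applied immediately and stays active until it is removed.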
![Image](https://github.com/netdata/learn/assets/70198089/6e3593e0-2a2b-4457-b007-50713ee49c2e) Read more about Silencing Alert notifications on [our documentation](https://learn.netdata.cloud/docs/alerting/notifications/netdata-cloud-notifications/#silencing-alert-notifications). ### Machine Learning - Extended Training to 24 Hours <a id="v1400-ml-extended-training"></a> Netdata trains ML models for each metric, using its past data. This allows Netdata to detect anomalous behaviors in metrics, based exclusively on the recent past data of the metric itself. Before this release, Netdata trained one model per metric, learning the behavior of each metric during the last 4 hours. In the previous release we introduced persisting these models to disk and loading them back when Netdata restarts. In this release we change the [default ML settings](https://github.com/netdata/netdata/pull/15093) to maintain multiple trained models per metric, covering the behavior of each metric for the last 24 hours. All these models are now consulted automatically in order to decide if a data collection point is anomalous or not. This has been implemented in a way that avoids introducing additional CPU overhead on Netdata agents. So, instead of training one model for 24 hours, which would introduce significant query overhead on the server, we train each metric every 3 hours using the last 6 hours of data, and we keep 9 models per metric. The most recent model is consulted first during anomaly detection. Additional models are consulted as long as the previous ones predict an anomaly. So only when all 9 models agree that a data collection point is anomalous do we mark the collected sample as anomalous in the database. The impact of these changes is more accurate anomaly detection out of the box, with far fewer false positives. 
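
The consultation order described above can be sketched in a few lines of Python. This is only an illustration of the consensus rule, not the Agent's actual ML code: the `band` helper stands in for a trained model and is purely hypothetical, as is the choice to treat a metric with no models as not anomalous.

```python
from typing import Callable, Sequence

# A "model" here is any callable that returns True when it considers the
# sample anomalous. Real trained models are far more involved; this type is
# an assumption made for the sketch.
Model = Callable[[float], bool]


def band(low: float, high: float) -> Model:
    """Hypothetical stand-in for a trained model: flags values outside [low, high]."""
    return lambda value: not (low <= value <= high)


def is_anomalous(sample: float, models_newest_first: Sequence[Model]) -> bool:
    """Mark a sample anomalous only if every model agrees.

    Models are consulted newest first, and consultation stops at the first
    model that considers the sample normal, so older models are only
    evaluated while all newer ones predict an anomaly.
    """
    if not models_newest_first:
        return False  # assumption: without any trained model, nothing is flagged
    for model in models_newest_first:
        if not model(sample):
            return False  # one model has seen this behavior before
    return True


# Three made-up models, newest first; older ones have seen wider ranges.
models = [band(0, 10), band(0, 50), band(0, 100)]
```

With these made-up models, a value of 30 is unusual for the newest model but normal for the second one, so it is not flagged; only a value that all models consider abnormal is marked anomalous. Stopping at the first disagreeing model is what keeps the common, non-anomalous case cheap.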
You can read more about it in [this deck](https://docs.google.com/presentation/d/18k0Q_JBMHZYLLo_Zl3spiWF_k_VMEXz92QFxTrpQe3Q/edit?usp=sharing) presented during a recent office hours ([office hours recording](https://youtu.be/2ZdffnGcX4w)). ### Rewritten SSL Support for the Agent <a id="v1400-streaming"></a> The SSL support at the Netdata Agent has been completely rewritten. The new code now reliably supports SSL connections for both the Netdata internal web server and streaming. It is also easier to understand, troubleshoot and expand. At the same time performance has been improved by removing redundant checks. During this process a long-standing bug on streaming connection timeouts was identified and fixed, making streaming reliable and robust overall. ## Alerts and Notifications <a id="v1400-alerts"></a> ### Mattermost notifications for Business Plan users <a id="v1400-mattermost-notifications"></a> Continuing to expand our set of alert notification methods, we added Mattermost as another notification integration option on Netdata Cloud. It provides another reliable way to deliver alerts to your team, helping you maintain the continuity and reliability of your services. Business Plan users can now configure Netdata Cloud to send alert notifications to their team on Mattermost. ![image](https://github.com/netdata/learn/assets/70198089/1d1ee168-db1d-414f-a8c7-44418fd0ae22) ## Visualizations / Charts and Dashboards <a id="v1400-visualizations"></a> ### Netdata Functions <a id="v1400-visualization-netdata-functions"></a> This builds on the work done in release v1.38, where we introduced real-time [functions](https://github.com/netdata/netdata/releases/tag/v1.38.0#v1380-functions) that enable you to trigger specific routines to be executed by a given Agent on demand. 
Our initial function provided detailed information on currently running processes on the node, effectively replacing top and iotop. We have now added the capability to group your results by specific attributes. For example, on the **Processes** function you are now able to group the results by: _Category_, _Cmd_ or _User_. With this capability you can now get a consolidated view of your reported statistics over any of these attributes. ![image](https://github.com/netdata/learn/assets/70198089/28115a0f-1c1a-4f87-8fd1-cce28dfb3620) ## External plugin integration The agent core now integrates better with external plugins. Under certain conditions, a failed plugin would not be correctly acknowledged by the agent, resulting in a defunct (i.e. zombie) plugin process. This is now fixed. ## Preliminary steps to split native packages <a id="v1400-packaging-split"></a> Starting with this release, our official DEB/RPM packages have been split so that each external data collection plugin is in its own package instead of having everything bundled into a single package. We have previously had our CUPS and FreeIPMI collectors split out like this, but this change extends that to almost all of our external data collectors. This is the first step towards making these external collectors optional on installs that use our native packages, which will in turn allow users to avoid installing things they don’t actually need. Short-term, these external collectors are listed as required dependencies to ensure that updates work correctly. At some point in the future almost all of them will be changed to be optional dependencies so that users can pick and choose which ones they want installed. This change also includes a large number of fixes for minor issues in our native packages, including better handling of user accounts and file permissions and more prevalent usage of file capabilities to improve the security of our native packages. 
## Acknowledgements <a id="v1400-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - [@n0099](https://github.com/n0099) for fixing typos in the documentation. - [@mochaaP](https://github.com/mochaaP) for fixing cross-compiling issues. - [@jmphilippe](https://github.com/jmphilippe) for making control address configurable in python.d/tor. - [@TougeAI](https://github.com/TougeAI) for documenting the "age" configuration option in python.d/smartd_log. - [@mochaaP](https://github.com/mochaaP) for adding support of python-oracledb to python.d/oracledb. ## Contributions <a id="v1400-contributions"></a> ### Collectors <a id="v1400-contributions-collectors"></a> #### Improvements - Add parent_table label to table/index metrics (go.d/postgres) ([#1199](https://github.com/netdata/go.d.plugin/pull/1199), [@ilyam8](https://github.com/ilyam8)) - Make tables and indexes limit configurable (go.d/postgres) ([#1200](https://github.com/netdata/go.d.plugin/pull/1200), [@ilyam8](https://github.com/ilyam8)) - Add Hyper-V metrics (go.d/windows) ([#1164](https://github.com/netdata/go.d.plugin/pull/1164), [@thiagoftsm](https://github.com/thiagoftsm)) - Add "maps per core" config option (ebpf.plugin) ([#14691](https://github.com/netdata/netdata/pull/14691), [@thiagoftsm](https://github.com/thiagoftsm)) - Add plugin that collect metrics from /sys/fs/debugfs (debugfs.plugin) ([#15017](https://github.com/netdata/netdata/pull/15017), [@thiagoftsm](https://github.com/thiagoftsm)) - Add support of python-oracledb (python.d/oracledb) ([#15074](https://github.com/netdata/netdata/pull/15074), [@EricAndrechek](https://github.com/EricAndrechek)) - Make control address configurable (python.d/tor) ([#15041](https://github.com/netdata/netdata/pull/15041), 
[@jmphilippe](https://github.com/jmphilippe)) - Make connection protocol configurable (python.d/oracledb) ([#15104](https://github.com/netdata/netdata/pull/15104), [@ilyam8](https://github.com/ilyam8)) - Add availability status chart and alarm (freeipmi.plugin) ([#15151](https://github.com/netdata/netdata/pull/15151), [@ilyam8](https://github.com/ilyam8)) - Improve error messages when legacy code is not installed (ebpf.plugin) ([#15146](https://github.com/netdata/go.d.plugin/pull/15146), [@thiagoftsm](https://github.com/thiagoftsm)) #### Bug fixes - Fix handling of newlines in HELP (go.d/prometheus) ([#1196](https://github.com/netdata/go.d.plugin/pull/1196), [@ilyam8](https://github.com/ilyam8)) - Fix collection of bind mounts (diskspace.plugin) ([#14831](https://github.com/netdata/netdata/pull/14831), [@MrZammler](https://github.com/MrZammler)) - Fix collection of zero metrics if Zswap is disabled (debugfs.plugin) ([#15054](https://github.com/netdata/netdata/pull/15054), [@ilyam8](https://github.com/ilyam8)) #### Other - Document the "age" configuration option (python.d/smartd_log) ([#15171](https://github.com/netdata/netdata/pull/15171), [@TougeAI](https://github.com/TougeAI)) - Send EXIT before exiting in (freeipmi.plugin, debugfs.plugin) ([#15140](https://github.com/netdata/netdata/pull/15140), [@ilyam8](https://github.com/ilyam8)) ### Documentation <a id="v1400-contributions-documentation"></a> - Add Mattermost cloud integration docs ([#15141](https://github.com/netdata/netdata/pull/15141), [@car12o](https://github.com/car12o)) - Update Events and Silencing Rules docs ([#15134](https://github.com/netdata/netdata/pull/15134), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix a typo in simple patterns readme ([#15135](https://github.com/netdata/netdata/pull/15135), [@n0099](https://github.com/n0099)) - Add netdata demo rooms to the list of demo urls ([#15120](https://github.com/netdata/netdata/pull/15120), 
[@andrewm4894](https://github.com/andrewm4894)) - Add initial draft for the silencing docs ([#15112](https://github.com/netdata/netdata/pull/15112), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Create category overview pages for Learn restructure ([#15091](https://github.com/netdata/netdata/pull/15091), [@Ancairon](https://github.com/Ancairon)) - Mention waive off of space subscription price ([#15082](https://github.com/netdata/netdata/pull/15082), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update Security doc ([#15072](https://github.com/netdata/netdata/pull/15072), [@tkatsoulas](https://github.com/tkatsoulas)) - Update netdata-security.md ([#15068](https://github.com/netdata/netdata/pull/15068), [@cakrit](https://github.com/cakrit)) - Fix wording in interact with charts doc ([#15040](https://github.com/netdata/netdata/pull/15040), [@Ancairon](https://github.com/Ancairon)) - Fix wording in the database readme ([#15034](https://github.com/netdata/netdata/pull/15034), [@Ancairon](https://github.com/Ancairon)) - Update troubleshooting-agent-with-cloud-connection.md ([#15029](https://github.com/netdata/netdata/pull/15029), [@cakrit](https://github.com/cakrit)) - Update the billing docs for the flow ([#15014](https://github.com/netdata/netdata/pull/15014), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update chart documentation ([#15010](https://github.com/netdata/netdata/pull/15010), [@Ancairon](https://github.com/Ancairon)) ### Packaging / Installation <a id="v1400-contributions-packaging"></a> - Fix package conflicts policy on deb based packages ([#15170](https://github.com/netdata/netdata/pull/15170), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix user and group handling in DEB packages ([#15166](https://github.com/netdata/netdata/pull/15166), [@Ferroin](https://github.com/Ferroin)) - Change mandatory packages for RPMs ([#15165](https://github.com/netdata/netdata/pull/15165), [@tkatsoulas](https://github.com/tkatsoulas)) - 
Restrict ebpf dep in DEB package to amd64 only ([#15161](https://github.com/netdata/netdata/pull/15161), [@Ferroin](https://github.com/Ferroin)) - Make plugin packages hard dependencies ([#15160](https://github.com/netdata/netdata/pull/15160), [@Ferroin](https://github.com/Ferroin)) - Update libbpf to v1.2.0 ([#15038](https://github.com/netdata/netdata/pull/15038), [@thiagoftsm](https://github.com/thiagoftsm)) - Provide necessary permission for the kickstart to run the netdata-updater script ([#15132](https://github.com/netdata/netdata/pull/15132), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix bundling of eBPF legacy code for DEB packages ([#15127](https://github.com/netdata/netdata/pull/15127), [@Ferroin](https://github.com/Ferroin)) - Fix package versioning issues ([#15125](https://github.com/netdata/netdata/pull/15125), [@Ferroin](https://github.com/Ferroin)) - Fix handling of eBPF plugin for DEB packages ([#15117](https://github.com/netdata/netdata/pull/15117), [@Ferroin](https://github.com/Ferroin)) - Improve some of the error messages in the kickstart script ([#15061](https://github.com/netdata/netdata/pull/15061), [@Ferroin](https://github.com/Ferroin)) - Split plugins to individual packages for DEB/RPM packaging ([#13927](https://github.com/netdata/netdata/pull/13927), [@Ferroin](https://github.com/Ferroin)) - Update agent telemetry url to be cloud function instead of posthog ([#15085](https://github.com/netdata/netdata/pull/15085), [@andrewm4894](https://github.com/andrewm4894)) - Remove Fedora 36 from CI and platform support. 
([#14938](https://github.com/netdata/netdata/pull/14938), [@Ferroin](https://github.com/Ferroin)) - Fix a fatal in the claiming script when the main action is not claiming ([#15039](https://github.com/netdata/netdata/pull/15039), [@ilyam8](https://github.com/ilyam8)) - Remove old logic for handling of legacy stock config files ([#14829](https://github.com/netdata/netdata/pull/14829), [@Ferroin](https://github.com/Ferroin)) - Make zlib compulsory dep ([#14928](https://github.com/netdata/netdata/pull/14928), [@underhood](https://github.com/underhood)) - Replace JudyLTablesGen with generated files ([#14984](https://github.com/netdata/netdata/pull/14984), [@mochaaP](https://github.com/mochaaP)) - Update SQLITE to version 3.41.2 ([#15031](https://github.com/netdata/netdata/pull/15031), [@stelfrag](https://github.com/stelfrag)) ### Streaming <a id="v1400-contributions-streaming"></a> - Streaming improvements and rewrite of SSL support in Netdata ([#15113](https://github.com/netdata/netdata/pull/15113), [@ktsaou](https://github.com/ktsaou)) ### Health <a id="v1400-contributions-health"></a> - Fix cockroachdb alarms ([#15095](https://github.com/netdata/netdata/pull/15095), [@ilyam8](https://github.com/ilyam8)) - Use chart labels to filter alarms ([#14982](https://github.com/netdata/netdata/pull/14982), [@MrZammler](https://github.com/MrZammler)) - Remove "families" from alarm configs ([#15086](https://github.com/netdata/netdata/pull/15086), [@ilyam8](https://github.com/ilyam8)) ### Exporting <a id="v1400-contributions-exporting"></a> - Add chart labels to Prometheus exporter ([#15099](https://github.com/netdata/netdata/pull/15099), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix out-of-order labels in Prometheus exporter ([#15094](https://github.com/netdata/netdata/pull/15094), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix out-of-order labels in Prometheus remote write exporter ([#15097](https://github.com/netdata/netdata/pull/15097), 
[@thiagoftsm](https://github.com/thiagoftsm)) ### ML <a id="v1400-contributions-ml"></a> - Update ML defaults to 24h ([#15093](https://github.com/netdata/netdata/pull/15093), [@andrewm4894](https://github.com/andrewm4894)) ### Other Notable Changes <a id="v1400-contributions-other"></a> #### Improvements - Reduce netdatacli size ([#15024](https://github.com/netdata/netdata/pull/15024), [@stelfrag](https://github.com/stelfrag)) - Make percentage-of-group aggregatable at cloud ([#15126](https://github.com/netdata/netdata/pull/15126), [@ktsaou](https://github.com/ktsaou)) - Add percentage calculation on grouped queries to /api/v2/data ([#15100](https://github.com/netdata/netdata/pull/15100), [@ktsaou](https://github.com/ktsaou)) - Add status information and streaming stats to /api/v2/nodes ([#15162](https://github.com/netdata/netdata/pull/15162), [@ktsaou](https://github.com/ktsaou)) #### Bug fixes - Fix the units when returning percentage of a group ([#15105](https://github.com/netdata/netdata/pull/15105), [@ktsaou](https://github.com/ktsaou)) - Fix uninitialized array vh in percentage-of-group ([#15106](https://github.com/netdata/netdata/pull/15106), [@ktsaou](https://github.com/ktsaou)) - Fix not respecting maximum message size limit of MQTT server ([#15009](https://github.com/netdata/netdata/pull/15009), [@underhood](https://github.com/underhood)) - Fix not freeing context when establishing an ACLK connection ([#15073](https://github.com/netdata/netdata/pull/15073), [@stelfrag](https://github.com/stelfrag)) - Fix sanitizing square brackets in label value ([#15131](https://github.com/netdata/netdata/pull/15131), [@ilyam8](https://github.com/ilyam8)) - Fix crash when UUID is NULL in SQLite ([#15147](https://github.com/netdata/netdata/pull/15147), [@stelfrag](https://github.com/stelfrag)) #### Code organization - Add initial minimal h2o webserver integration ([#14585](https://github.com/netdata/netdata/pull/14585), [@underhood](https://github.com/underhood)) - 
Release buffer in case of error -- CID 385075 ([#15090](https://github.com/netdata/netdata/pull/15090), [@stelfrag](https://github.com/stelfrag)) - Improve cleanup of health log table ([#15045](https://github.com/netdata/netdata/pull/15045), [@MrZammler](https://github.com/MrZammler)) - Simplify loop in alert checkpoint ([#15065](https://github.com/netdata/netdata/pull/15065), [@MrZammler](https://github.com/MrZammler)) - Only queue an alert to the cloud when it's inserted ([#15110](https://github.com/netdata/netdata/pull/15110), [@MrZammler](https://github.com/MrZammler)) - Generate, store and transmit a unique alert event_hash_id ([#15111](https://github.com/netdata/netdata/pull/15111), [@MrZammler](https://github.com/MrZammler)) - Fix syntax in config.ac ([#15139](https://github.com/netdata/netdata/pull/15139), [@underhood](https://github.com/underhood)) - Add library to encode/decode Gorilla compressed buffers. ([#15128](https://github.com/netdata/netdata/pull/15128), [@vkalintiris](https://github.com/vkalintiris)) - Fix coverity issues ([#15169](https://github.com/netdata/netdata/pull/15169), [@stelfrag](https://github.com/stelfrag)) - Fix CID 385073 -- Uninitialized scalar variable ([#15163](https://github.com/netdata/netdata/pull/15163), [@stelfrag](https://github.com/stelfrag)) - Fix CodeQL warning ([#15062](https://github.com/netdata/netdata/pull/15062), [@stelfrag](https://github.com/stelfrag)) ## Deprecation notice <a id="v1400-deprecation-notice"></a> The following items will be removed in our next minor release (v1.41.0): > Patch releases (if any) will not be affected. 
| Component | Type | Will be replaced by |
|-----------|:----:|:-------------------:|
| [python.d/nvidia_smi](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/nvidia_smi) | collector | [go.d/nvidia_smi](https://github.com/netdata/go.d.plugin/tree/master/modules/nvidia_smi) |
| `family` attribute | alert configuration and Health API | [chart labels](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-chart-labels) attribute (more details on [netdata#15030](https://github.com/netdata/netdata/issues/15030)) |

## Cloud recommended version <a id="v1400-cloud-recommended-version"></a> When using Netdata Cloud, the agent version required to benefit most from the latest features is **one version before the last stable**. With this release, this becomes `v1.39.1`, and you'll be notified and guided to take action on the UI if you are running agents on lower versions. Check here for details on how to [Update Netdata](https://learn.netdata.cloud/docs/maintenance-operations-on-netdata-agents/update-netdata-agents) agents. ## Netdata Release Meetup <a id="v1400-release-meetup"></a> Join the Netdata team on the **19th of June at 16:00 UTC** for the [Netdata Release Meetup](https://discord.gg/Ysd4rrpt?event=1118488564427137044). Together we’ll cover: - Release Highlights. - Acknowledgements. - Q&A with the community. [RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/294178290) - we look forward to meeting you. ## Support options <a id="v1400-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. 
Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it! ## Running survey <a id="v1400-running-survey"></a> Help us make Netdata even greater! We are trying to gather valuable information that is key for us to better position Netdata and ensure we keep bringing more value to you. We would appreciate it if you could take some time to answer [this short survey (4 questions only)](https://forms.gle/oCJo4WDJfqfvBZi17). 2023-06-14T17:13:28+00:00 netdata v1.40.1 netdata v1.40.1 2023-06-27T16:43:27+00:00 Netdata v1.40.1 is a patch release to address issues discovered since [v1.40.0](https://github.com/netdata/netdata/releases/tag/v1.40.0). This patch release provides the following bug fixes: - Fixed ebpf sync thread crash ([#15174](https://github.com/netdata/netdata/pull/15174), [thiagoftsm](https://github.com/thiagoftsm)). - Fixed ebpf threads taking too long to terminate ([#15187](https://github.com/netdata/netdata/pull/15187), [thiagoftsm](https://github.com/thiagoftsm)). 
- Fixed building with eBPF on RPM systems due to missing build dependency ([#15192](https://github.com/netdata/netdata/pull/15192), [k0ste](https://github.com/k0ste)). - Fixed building on macOS due to incorrect include directive ([#15195](https://github.com/netdata/netdata/pull/15195), [nandahkrishna](https://github.com/nandahkrishna)). - Fixed a crash during health log entry processing ([#15209](https://github.com/netdata/netdata/pull/15209), [stelfrag](https://github.com/stelfrag)). - Fixed architecture detection on i386 when building native packages ([#15218](https://github.com/netdata/netdata/pull/15218), [ilyam8](https://github.com/ilyam8)). - Fixed SSL non-blocking retry handling in the web server ([#15222](https://github.com/netdata/netdata/pull/15222), [ktsaou](https://github.com/ktsaou)). - Fixed handling of plugin ownership in static builds ([#15230](https://github.com/netdata/netdata/pull/15230), [Ferroin](https://github.com/Ferroin)). - Fixed an exception in python.d/nvidia_smi due to not handling N/A value ([#15231](https://github.com/netdata/netdata/pull/15231), [ilyam8](https://github.com/ilyam8)). - Fixed installing the wrong systemd unit file on older RPM systems ([#15240](https://github.com/netdata/netdata/pull/15240), [Ferroin](https://github.com/Ferroin)). - Fixed creation of charts for network interfaces of virtual machines/containers as normal network interface charts ([#15244](https://github.com/netdata/netdata/pull/15244), [ilyam8](https://github.com/ilyam8)). - Fixed building on openSUSE Leap 15.4 due to incorrect $(libh2o_dir) expansion ([#15253](https://github.com/netdata/netdata/pull/15253), [Dim-P](https://github.com/Dim-P)). ## Acknowledgements <a id="v1401-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. 
- @k0ste for fixing building with eBPF on RPM systems. - @nandahkrishna for fixing building on macOS. ## Support options <a id="v1401-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it! 2023-06-27T16:43:27+00:00 netdata v1.41.0 netdata v1.41.0 2023-07-19T21:37:16+00:00 Check out the [v1.41 release meetup recording](https://www.youtube.com/watch?v=WCUn4-LneCw) or read on to learn more about the new UI and other features in this release.
[![netdata release notes meetup](https://img.youtube.com/vi/WCUn4-LneCw/0.jpg)](https://www.youtube.com/watch?v=WCUn4-LneCw) - [Netdata Growth](#v1410-netdata-open-source-growth) - [Release Highlights](#v1410-release-highlights) - **[New Agent Dashboard!](#v1410-one-dashboard)** - [Netdata Assistant](#v1410-netdata-assistant) - [New FreeIPMI collector for monitoring enterprise hardware](#v1410-netdata-freeipmi) - [Netdata Detects FDs Leaking](#v1410-netdata-apps) - [Acknowledgements](#v1410-acknowledgements) - [Contributions](#v1410-contributions) - [Collectors](#v1410-contributions-collectors) - [Documentation](#v1410-contributions-documentation) - [Packaging/Installation](#v1410-contributions-packaging) - [Health](#v1410-contributions-health) - [Exporting](#v1410-contributions-exporting) - [Other Notable Changes](#v1410-contributions-other) - [Deprecation notice](#v1410-deprecation-notice) - [Deprecated in this release](#v1410-deprected-in-this-release) - [Netdata Release Meetup](#v1410-netdata-release-meetup) - [Support options](#v1410-support-options) Steady to our schedule, this is another great Netdata release! ## Netdata Growth <a id="v1410-netdata-open-source-growth"></a> - 64 k GitHub Stars ⭐ - 1.7 M monitored nodes - 570+ M docker hub pulls > **[Give Netdata a ⭐ too, on GitHub!](https://github.com/netdata/netdata)** ❤️ Thank you for your love! 🚀 You rock! ## Release Highlights <a id="v1410-release-highlights"></a> ### New Agent Dashboard <a id="v1410-one-dashboard"></a> Netdata Agents and Parents now have a new UI!
New **CHARTS** :green_circle: New **SUMMARIES** :green_circle: **MACHINE-LEARNING FIRST** :green_circle: **INFRASTRUCTURE LEVEL DASHBOARDS** :green_circle: **FILTER**, **SLICE**, and **DICE** any dataset :green_circle: **ANOMALY ADVISOR** :green_circle: **METRICS CORRELATIONS** :green_circle: **NETDATA FUNCTIONS** :green_circle: **EVENTS FEED** :green_circle: **HEATMAPS** :green_circle: ![Netdata Agent](https://github.com/netdata/netdata/assets/2662304/af4caa23-19be-46ef-9779-8fdad8d99d2a) In the last few months, we have ported and open-sourced all Netdata Cloud APIs to the Netdata Agent, allowing Netdata Parents to drive the same multi-node / infrastructure level dashboards Netdata Cloud provides! So, as of today, Netdata Agents and Parents present the same UI as Netdata Cloud: exactly the same dashboard, charts, and features! #### Single Node Dashboard Changes Apart from the entirely new look, single-node dashboards now group similar charts together. So, all disk drives, network interfaces, and cgroups (containers and VMs) are now a single set of charts. This allows Netdata to aggregate a vast number of datasets in a chart, like the following, where almost 20k containers are now manageable: ![image](https://github.com/netdata/netdata/assets/2662304/b4f8d79b-a3f9-4b15-a75b-a20ab76f84ae) To make it easier for you to navigate, filter, slice, and dice the data, the menus above each chart give you easy access to all the data of the chart: ![Netdata Agent 2](https://github.com/netdata/netdata/assets/2662304/49981d85-89b9-4b95-8e45-da7a39e6dd48) #### Multi Node Dashboards When Netdata Agents are configured as Parents (multiple other agents stream metrics to them), they now present multi-node and multi-instance charts. At the top right corner of the dashboard, there is the global nodes filter, from which you can slice the entire dashboard for one or a few of your nodes.
![image](https://github.com/netdata/netdata/assets/2662304/07753520-ebd7-423e-9105-d1cba106035c) #### Want to know more? Get a firsthand walkthrough with Costa Tsaousis, Netdata's Founder, on the rationale for this change and the path Netdata is taking by checking the video from Netdata Office Hours on [YouTube](https://www.youtube.com/live/UNnQMetWDZI?feature=share&t=840). #### The old dashboards are still accessible You can still access all versions of the dashboards, as follows: - **`http://your.server:19999/`** The default dashboard is now a **live** version of the new UI. The dashboard static files are served by Cloudflare and are automatically updated when we release a new version of the UI, so that your Netdata agent is always up to date. - **`http://your.server:19999/v2/`** A **local copy** of the latest dashboard, as it was at the time the agent was released. This is distributed with Netdata under the [Netdata Cloud UI License v1.0](https://github.com/netdata/netdata/blob/master/web/gui/v2/LICENSE.md). The local copy is automatically used if for any reason the web browser cannot download the live version of it. - **`http://your.server:19999/v1/`** The previous single-node version of the Netdata Agent dashboard. - **`http://your.server:19999/v0/`** The now ancient, original version of the Netdata Agent dashboard. ### Netdata Assistant <a id="v1410-netdata-assistant"></a> Netdata Assistant: Your AI-Powered Troubleshooting Sidekick The Netdata Assistant is an AI-powered tool that uses large language models and our community's knowledge to guide you during troubleshooting and help you get to the root cause sooner. The goal of the Netdata Assistant is straightforward: to make your troubleshooting process easier. It's here to save you from the hassle of sifting through tons of information so you can focus on solving the problem at hand. It will give you the lowdown on the alert, why it's happening, and why you should care. 
It'll also guide you on how to troubleshoot it and even offer some handy web links for more info if you're interested. ![image](https://github.com/netdata/netdata/assets/82235632/c290084a-d006-4f42-b175-0ebbf03ef3fc) Read more about it on the Netdata blog [here](https://blog.netdata.cloud/netdata-assistant/). ### New FreeIPMI collector for monitoring enterprise hardware <a id="v1410-netdata-freeipmi"></a> Netdata got a new FreeIPMI collector. The new collector is able to collect IPMI sensors at a much better data collection rate, and it is more reliable and robust compared to the previous one. We have also categorized all sensors based on the component they monitor: ![image](https://github.com/netdata/netdata/assets/2662304/9bd23ffa-4166-43a3-947b-0b4b893867e7) And provided as labels the exact sensor name each metric refers to: ![image](https://github.com/netdata/netdata/assets/2662304/1892a54f-a313-40d2-8d11-f4a757cbf652) ### Netdata Detects FDs Leaking <a id="v1410-netdata-apps"></a> "FD" stands for "file descriptor". A file descriptor is an integer that the operating system assigns to an open file to track it. This includes regular data files, directories, network sockets, pipes, and other types of I/O streams. In Linux, everything is treated as a file, which includes hardware devices, directories, and sockets. Each open file is assigned a file descriptor. When a file is closed, its file descriptor is freed up for reuse. However, if an application doesn't close a file when it's done with it, that's called a "file descriptor leak". File descriptor leaks can cause several problems: 1. **Resource exhaustion:** Each process has a limit to the number of file descriptors it can open. If a process continually leaks file descriptors without closing them, it will eventually hit this limit and won't be able to open any more files, which often causes the process to crash. 2. 
**Unexpected behavior:** Open file descriptors hold resources, like network sockets, that might be expected to be available for other uses. If these resources are tied up due to a leak, it can cause unexpected behavior. 3. **Security issues:** File descriptors can sometimes be used to gain unauthorized access to data if they're not properly managed. `apps.plugin` is now able to track the usage of FDs against the limits set for each application. We have added an `fds` category in the `Applications` section of the dashboard. The first chart shows the percentage of FDs used by each application against its limits: ![image](https://github.com/netdata/netdata/assets/2662304/d621cc94-4c7d-4478-8778-78fbba62919e) ## Acknowledgements <a id="v1410-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @k0ste for improving Prometheus exporting doc. - @carlocab for replacing `info` macro with a less generic name. - @MYanello for updating the pfSense package installation instructions.
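As a rough illustration of the "percentage of FDs used against the limit" idea from the "Netdata Detects FDs Leaking" section above, here is a minimal Python sketch. It is not how `apps.plugin` is implemented; it assumes a Linux `/proc` filesystem, and the function names (`parse_max_open_files`, `fd_usage_percent`, `process_fd_usage`) are hypothetical helpers invented for this example.

```python
import os
from typing import Optional


def parse_max_open_files(limits_text: str) -> Optional[int]:
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None when the limit is 'unlimited' or the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            fields = line.split()
            soft = fields[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage_percent(open_fds: int, soft_limit: int) -> float:
    """Percentage of the soft FD limit currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return 100.0 * open_fds / soft_limit


def process_fd_usage(pid: str = "self") -> Optional[float]:
    """Hypothetical helper: FD usage percentage for one process (Linux only)."""
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        limit = parse_max_open_files(f.read())
    return None if limit is None else fd_usage_percent(open_fds, limit)
```

On a Linux host, calling `process_fd_usage()` with no arguments reports the usage of the calling process. A value that keeps climbing toward 100 over time is the kind of pattern a file descriptor leak would produce.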
## Contributions <a id="v1410-contributions"></a> ### Collectors <a id="v1410-contributions-collectors"></a> #### Improvements - Improve fds monitoring (apps.plugin) ([#15437](https://github.com/netdata/netdata/pull/15437), [@ktsaou](https://github.com/ktsaou)) - Add application groups file descriptor limit monitoring (apps.plugin) ([#15417](https://github.com/netdata/netdata/pull/15417), [@ktsaou](https://github.com/ktsaou)) - Re-create sdr cache on start (freeipmi.plugin) ([#15361](https://github.com/netdata/netdata/pull/15361), [@ktsaou](https://github.com/ktsaou)) - Add sensor state chart, create a per-sensor chart instead of a per-sensor dimension (freeipmi.plugin) ([#15327](https://github.com/netdata/netdata/pull/15327), [@ktsaou](https://github.com/ktsaou)) - Expose CmdLine in apps function (apps.plugin) ([#15275](https://github.com/netdata/netdata/pull/15275), [@ilyam8](https://github.com/ilyam8)) - Remove pod_uid and container_id labels in k8s (cgroups.plugin) ([#15216](https://github.com/netdata/netdata/pull/15216), [@ilyam8](https://github.com/ilyam8)) - Add cluster mode (go.d/elasticsearch) ([#1227](https://github.com/netdata/go.d.plugin/pull/1227), [@ilyam8](https://github.com/ilyam8)) - Add 'fallback_type' config option to match Untyped (go.d/prometheus) ([#1225](https://github.com/netdata/go.d.plugin/pull/1225), [@ilyam8](https://github.com/ilyam8)) #### Bug fixes - Fix sensor state updates (freeipmi.plugin) ([#15360](https://github.com/netdata/netdata/pull/15360), [@ilyam8](https://github.com/ilyam8)) - Fix tc.plugin charts labels (tc.plugin) ([#15262](https://github.com/netdata/netdata/pull/15262), [@ilyam8](https://github.com/ilyam8)) - Fix collecting hostgroup from stats_mysql_connection_pool (go.d/proxysql) ([#1226](https://github.com/netdata/go.d.plugin/pull/1226), [@ilyam8](https://github.com/ilyam8)) #### Other - Add eBPF Functions to enable/disable threads (ebpf.plugin) ([#15214](https://github.com/netdata/netdata/pull/15214),
[@thiagoftsm](https://github.com/thiagoftsm)) - Hide eBPF functions (ebpf.plugin) ([#15404](https://github.com/netdata/netdata/pull/15404), [@thiagoftsm](https://github.com/thiagoftsm)) - Add profile.plugin ([#13962](https://github.com/netdata/netdata/pull/13962), [@vkalintiris](https://github.com/vkalintiris)) ### Documentation <a id="v1410-contributions-documentation"></a> - Add link for netdata cloud and sign-in cta ([#15431](https://github.com/netdata/netdata/pull/15431), [@andrewm4894](https://github.com/andrewm4894)) - Update Netdata logo in README.md ([#15424](https://github.com/netdata/netdata/pull/15424), [@christophidesp](https://github.com/christophidesp)) - Fix a typo in health.d/consul.conf ([#15419](https://github.com/netdata/netdata/pull/15419), [@Ancairon](https://github.com/Ancairon)) - Add reference to CNCF ([#15408](https://github.com/netdata/netdata/pull/15408), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix instructions on how to determine which installation method to use ([#15351](https://github.com/netdata/netdata/pull/15351), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Update the default Docker installation to provide the full feature set ([#15339](https://github.com/netdata/netdata/pull/15339), [@ilyam8](https://github.com/ilyam8)) - Fix swapped use of volume/bind mount in Docker readme ([#15298](https://github.com/netdata/netdata/pull/15298), [@Ancairon](https://github.com/Ancairon)) - Add Streaming and replication doc ([#15297](https://github.com/netdata/netdata/pull/15297), [@Ancairon](https://github.com/Ancairon)) - Update "health enabled by default" description in stream.conf ([#15291](https://github.com/netdata/netdata/pull/15291), [@ilyam8](https://github.com/ilyam8)) - Remove extra parenthesis from doc ([#15290](https://github.com/netdata/netdata/pull/15290), [@Ancairon](https://github.com/Ancairon)) - Merge spaces, war rooms and invite your team to one place 
([#15289](https://github.com/netdata/netdata/pull/15289), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Fix mistype for 'send automatic labels' Prometheus option ([#15282](https://github.com/netdata/netdata/pull/15282), [@k0ste](https://github.com/k0ste)) - Small readme improvements ([#15270](https://github.com/netdata/netdata/pull/15270), [@andrewm4894](https://github.com/andrewm4894)) - Update pfsense.md package install instructions ([#15250](https://github.com/netdata/netdata/pull/15250), [@MYanello](https://github.com/MYanello)) - Add RocketChat cloud integration docs ([#15205](https://github.com/netdata/netdata/pull/15205), [@car12o](https://github.com/car12o)) ### Packaging / Installation <a id="v1410-contributions-packaging"></a> - Update v2 dashboard to v6.21.3 ([#15448](https://github.com/netdata/netdata/pull/15448), [@ilyam8](https://github.com/ilyam8)) - Fix arch detection in static install update ([#15396](https://github.com/netdata/netdata/pull/15396), [@ilyam8](https://github.com/ilyam8)) - Add missing files to web/gui/Makefile.am. 
([#15383](https://github.com/netdata/netdata/pull/15383), [@Ferroin](https://github.com/Ferroin)) - Build optimizations ([#15381](https://github.com/netdata/netdata/pull/15381), [@tkatsoulas](https://github.com/tkatsoulas)) - Update libbpf to v1.2.2 ([#15373](https://github.com/netdata/netdata/pull/15373), [@thiagoftsm](https://github.com/thiagoftsm)) - Update go.d.plugin to v0.54.0 ([#15312](https://github.com/netdata/netdata/pull/15312), [@ilyam8](https://github.com/ilyam8)) - Only try to enable _FORTIFY_SOURCE if the user has not disabled optimizations ([#15284](https://github.com/netdata/netdata/pull/15284), [@Ferroin](https://github.com/Ferroin)) - Assorted kickstart script improvements ([#15243](https://github.com/netdata/netdata/pull/15243), [@Ferroin](https://github.com/Ferroin)) - Fix file permissions under directory ([#15208](https://github.com/netdata/netdata/pull/15208), [@stelfrag](https://github.com/stelfrag)) - Add configuration file for netdata-updater.sh ([#15149](https://github.com/netdata/netdata/pull/15149), [@Ferroin](https://github.com/Ferroin)) - Add hardening options to CFLAGS by default if they are available ([#15087](https://github.com/netdata/netdata/pull/15087), [@Ferroin](https://github.com/Ferroin)) - Consistently start the agent as root and rely on it to drop privileges properly ([#14890](https://github.com/netdata/netdata/pull/14890), [@Ferroin](https://github.com/Ferroin)) - Add support for openSUSE tumbleweed ([#14692](https://github.com/netdata/netdata/pull/14692), [@tkatsoulas](https://github.com/tkatsoulas)) ### Health <a id="v1410-contributions-health"></a> - Removing some critical thresholds ([#15124](https://github.com/netdata/netdata/pull/15124), [@M4itee](https://github.com/M4itee)) - Fix evaluating expression with `nan` ([#15348](https://github.com/netdata/netdata/pull/15348), [@ilyam8](https://github.com/ilyam8)) - Respect overriding nc binary for IRC notifications ([#15310](https://github.com/netdata/netdata/pull/15310), 
[@ilyam8](https://github.com/ilyam8)) - Keep health log history in seconds ([#15314](https://github.com/netdata/netdata/pull/15314), [@MrZammler](https://github.com/MrZammler)) - Fix windows alarms for virtual nodes ([#15376](https://github.com/netdata/netdata/pull/15376), [@ilyam8](https://github.com/ilyam8)) ### Exporting <a id="v1410-contributions-exporting"></a> - Hide not available for viewers charts when exporting in the shell format ([#15309](https://github.com/netdata/netdata/pull/15309), [@ilyam8](https://github.com/ilyam8)) - Fix slow exporting in Prometheus format ([#15276](https://github.com/netdata/netdata/pull/15276), [@ilyam8](https://github.com/ilyam8)) ### Other Notable Changes <a id="v1410-contributions-other"></a> #### Improvements - Enrichment of /api/v2, buildinfo improvements and code cleanup ([#15294](https://github.com/netdata/netdata/pull/15294), [@ktsaou](https://github.com/ktsaou)) #### Bug fixes - Fix unlocked registry access and add hostname to search response ([#15426](https://github.com/netdata/netdata/pull/15426), [@ktsaou](https://github.com/ktsaou)) - Fix interpreting encoded URLs ([#15422](https://github.com/netdata/netdata/pull/15422), [@MrZammler](https://github.com/MrZammler)) - Fix compilation on BSD ([#15331](https://github.com/netdata/netdata/pull/15331), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix virtual hosts showing up as stale nodes ([#15313](https://github.com/netdata/netdata/pull/15313), [@ktsaou](https://github.com/ktsaou)) - Fix clean up of charts generated by external plugins ([#15307](https://github.com/netdata/netdata/pull/15307), [@stelfrag](https://github.com/stelfrag)) - Fix crash when opening Alarms Log tab on the parent instance ([#15306](https://github.com/netdata/netdata/pull/15306), [@MrZammler](https://github.com/MrZammler)) - Fix infinite loop in webserver ([#15287](https://github.com/netdata/netdata/pull/15287), [@ktsaou](https://github.com/ktsaou)) #### Code organization - Add chart id and 
name to alert instances and transitions ([#15430](https://github.com/netdata/netdata/pull/15430), [@ktsaou](https://github.com/ktsaou)) - Use real-time clock for http response headers ([#15421](https://github.com/netdata/netdata/pull/15421), [@ktsaou](https://github.com/ktsaou)) - Pre release fixes ([#15405](https://github.com/netdata/netdata/pull/15405), [@ktsaou](https://github.com/ktsaou)) - Add expiration to bearer token response ([#15392](https://github.com/netdata/netdata/pull/15392), [@ktsaou](https://github.com/ktsaou)) - Fix CodeQL alert ([#15384](https://github.com/netdata/netdata/pull/15384), [@stelfrag](https://github.com/stelfrag)) - Update http response code descriptions ([#15379](https://github.com/netdata/netdata/pull/15379), [@ktsaou](https://github.com/ktsaou)) - Suppress H2O compilation warnings ([#15378](https://github.com/netdata/netdata/pull/15378), [@stelfrag](https://github.com/stelfrag)) - Fix coverity issues ([#15375](https://github.com/netdata/netdata/pull/15375), [@stelfrag](https://github.com/stelfrag)) - Dont log error on opening .environment ([#15371](https://github.com/netdata/netdata/pull/15371), [@ilyam8](https://github.com/ilyam8)) - Rename log_access and log_health ([#15368](https://github.com/netdata/netdata/pull/15368), [@MrZammler](https://github.com/MrZammler)) - Agent alert notifications redirect ([#15350](https://github.com/netdata/netdata/pull/15350), [@ktsaou](https://github.com/ktsaou)) - Bearer protection - additions ([#15349](https://github.com/netdata/netdata/pull/15349), [@ktsaou](https://github.com/ktsaou)) - Bearer improvements ([#15342](https://github.com/netdata/netdata/pull/15342), [@ktsaou](https://github.com/ktsaou)) - Add hostnames and items statistics to alerts_transitions outputs ([#15329](https://github.com/netdata/netdata/pull/15329), [@ktsaou](https://github.com/ktsaou)) - Use spinlock in host and chart ([#15328](https://github.com/netdata/netdata/pull/15328), [@stelfrag](https://github.com/stelfrag)) - 
Fix coverity issue 394862 - Argument cannot be negative ([#15324](https://github.com/netdata/netdata/pull/15324), [@stelfrag](https://github.com/stelfrag)) - Rename log Macros (debug) ([#15322](https://github.com/netdata/netdata/pull/15322), [@thiagoftsm](https://github.com/thiagoftsm)) - Bearer authorization API ([#15321](https://github.com/netdata/netdata/pull/15321), [@ktsaou](https://github.com/ktsaou)) - Fix not using host prefix in read_cmdline in read_cmdline() ([#15320](https://github.com/netdata/netdata/pull/15320), [@ilyam8](https://github.com/ilyam8)) - Update local-listener to use libnetdata ([#15319](https://github.com/netdata/netdata/pull/15319), [@ktsaou](https://github.com/ktsaou)) - Avoid memory allocations for alert transitions facets processing ([#15318](https://github.com/netdata/netdata/pull/15318), [@ktsaou](https://github.com/ktsaou)) - Add summary linking to alert instances (ati) when options=summary,values is requested ([#15317](https://github.com/netdata/netdata/pull/15317), [@ktsaou](https://github.com/ktsaou)) - Fix alerts transitions sorting ([#15315](https://github.com/netdata/netdata/pull/15315), [@ktsaou](https://github.com/ktsaou)) - Change info to netdata_log_info in sqlite_db_migration.c ([#15303](https://github.com/netdata/netdata/pull/15303), [@MrZammler](https://github.com/MrZammler)) - Change query to store host system info values ([#15300](https://github.com/netdata/netdata/pull/15300), [@MrZammler](https://github.com/MrZammler)) - Change info to netdata_log_info in profile.plugin ([#15299](https://github.com/netdata/netdata/pull/15299), [@vkalintiris](https://github.com/vkalintiris)) - Rename generic `error` function ([#15296](https://github.com/netdata/netdata/pull/15296), [@thiagoftsm](https://github.com/thiagoftsm)) - Optimizations part 3 ([#15293](https://github.com/netdata/netdata/pull/15293), [@ktsaou](https://github.com/ktsaou)) - Send alert chart labels config key to cloud 
([#15283](https://github.com/netdata/netdata/pull/15283), [@MrZammler](https://github.com/MrZammler)) - Optimizations part 2 ([#15280](https://github.com/netdata/netdata/pull/15280), [@ktsaou](https://github.com/ktsaou)) - Misc alert fixes ([#15274](https://github.com/netdata/netdata/pull/15274), [@MrZammler](https://github.com/MrZammler)) - Replace `info` macro with a less generic name ([#15266](https://github.com/netdata/netdata/pull/15266), [@carlocab](https://github.com/carlocab)) - Rewrite /api/v2/alerts ([#15257](https://github.com/netdata/netdata/pull/15257), [@ktsaou](https://github.com/ktsaou)) - Use gperf for the pluginsd/streaming parser hashtable ([#15251](https://github.com/netdata/netdata/pull/15251), [@ktsaou](https://github.com/ktsaou)) - URL rewrite at the agent web server to support multiple dashboard versions ([#15247](https://github.com/netdata/netdata/pull/15247), [@ktsaou](https://github.com/ktsaou)) - Fix coverity 393183 & 393182 ([#15234](https://github.com/netdata/netdata/pull/15234), [@MrZammler](https://github.com/MrZammler)) - Create index for health log migration ([#15233](https://github.com/netdata/netdata/pull/15233), [@stelfrag](https://github.com/stelfrag)) - New alerts endpoint ([#15232](https://github.com/netdata/netdata/pull/15232), [@stelfrag](https://github.com/stelfrag)) - Various /api/v2 improvements ([#15227](https://github.com/netdata/netdata/pull/15227), [@ktsaou](https://github.com/ktsaou)) - Relax jnfv2 caching ([#15224](https://github.com/netdata/netdata/pull/15224), [@ktsaou](https://github.com/ktsaou)) - Fix /api/v2/contexts,nodes,nodes_instances,q before match ([#15223](https://github.com/netdata/netdata/pull/15223), [@ktsaou](https://github.com/ktsaou)) - Add recursive readers support to RW_SPINLOCK ([#15217](https://github.com/netdata/netdata/pull/15217), [@ktsaou](https://github.com/ktsaou)) - Allow overriding pipename from env ([#15215](https://github.com/netdata/netdata/pull/15215), 
[@vkalintiris](https://github.com/vkalintiris)) - Memory reductions and optimizations ([#15204](https://github.com/netdata/netdata/pull/15204), [@ktsaou](https://github.com/ktsaou)) - Agent dashboard reorganization ([#15200](https://github.com/netdata/netdata/pull/15200), [@Ferroin](https://github.com/Ferroin)) - Add two functions that allow someone to start/stop ML ([#15185](https://github.com/netdata/netdata/pull/15185), [@vkalintiris](https://github.com/vkalintiris)) - Add streaming function and various improvements to /api/v2/nodes ([#15168](https://github.com/netdata/netdata/pull/15168), [@ktsaou](https://github.com/ktsaou)) - Use a single health log table ([#15157](https://github.com/netdata/netdata/pull/15157), [@MrZammler](https://github.com/MrZammler)) - Redirect to index.html when a file is not found by web server ([#15143](https://github.com/netdata/netdata/pull/15143), [@MrZammler](https://github.com/MrZammler)) - Additional CO-RE code (eBPF.plugin) ([#15078](https://github.com/netdata/netdata/pull/15078), [@thiagoftsm](https://github.com/thiagoftsm)) ## Deprecation notice <a id="v1410-deprecation-notice"></a> There is not an obvious list of items that will be deprecated in the upcoming release (v1.42.0). 
Feel free to check and elaborate on the [upcoming backlog](https://github.com/netdata/netdata#whats-new-and-coming) ### Deprecated in this release <a id="v1410-deprected-in-this-release"></a> In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.40.0#v1400-deprecation-notice), the following items have been deprecated in this release:

| Component | Type | Will be replaced by |
|-----------|:----:|:-------------------:|
| [python.d/nvidia_smi](https://github.com/netdata/netdata/tree/v1.37.1/collectors/python.d.plugin/nvidia_smi) | collector | [go.d/nvidia_smi](https://github.com/netdata/go.d.plugin/tree/master/modules/nvidia_smi) |
| `family` attribute | alert configuration and Health API | [chart labels](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-chart-labels) attribute (more details on [netdata#15030](https://github.com/netdata/netdata/issues/15030)) |

## Netdata Release Meetup <a id="v1410-netdata-release-meetup"></a> Join the Netdata team on the **21st of July at 17:00 UTC** for the [Netdata Release Meetup](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/294882479/). Together we’ll cover: - Release Highlights. - Acknowledgements. - Q&A with the community. [RSVP now](https://www.meetup.com/netdata-infrastructure-monitoring-meetup-group/events/294178290) - we look forward to meeting you. ## Support options <a id="v1410-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution.
Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1400 engineers are already using it! 2023-07-19T21:37:16+00:00 netdata v1.42.0 netdata v1.42.0 2023-08-09T17:29:50+00:00 - [Netdata Growth ](#v1420-netdata-open-source-growth) - [Release Highlights](#v1420-release-highlights) - [Integrations Marketplace](#v1420-integrations-marketplace) - [SystemD Journal](#v1420-systemd-journal) - [Claiming via the UI](#v1420-claiming-via-the-ui) - [Easily Spot Anomalies](#v1420-easily-spot-anomalies) - [Acknowledgements ](#v1420-acknowledgements) - [Contributions](#v1420-contributions) - [Collectors](#v1420-contributions-collectors) - [Documentation ](#v1420-contributions-documentation) - [Packaging/Installation](#v1420-contributions-packaging) - [Health](#v1420-contributions-health) - [Other Notable Changes](#v1420-contributions-other) - [Deprecation notice](#v1420-deprecation-notice) - [Netdata Release Meetup](#v1420-netdata-release-meetup) - [Support options](#v1420-support-options) Steady to our schedule, this is another great Netdata release! 
## Netdata Growth <a id="v1420-netdata-open-source-growth"></a> - **64.5 k GitHub Stars ⭐** Netdata made it to the top trending repos on GitHub after the last release. ❤️ Thank you for your love! 🚀 You rock! > **[Give Netdata a ⭐ on GitHub too!](https://github.com/netdata/netdata)** - **580+ M docker hub pulls, running at 200+ k per day.** Netdata is a verified publisher on Docker Hub, and our users enjoy free unlimited Docker Hub pulls! ## Release Highlights <a id="v1420-release-highlights"></a> ### Integrations Marketplace <a id="v1420-integrations-marketplace"></a> A beta version of the Netdata Marketplace is included in this release: ![image](https://github.com/netdata/netdata/assets/2662304/bc9ba52d-363c-4381-babd-4f85592ed745) More than 800 integrations are available, directly from the dashboard. For each integration, all the information required to get it up and running is included: ![2023-08-08 15-36-40](https://github.com/netdata/netdata/assets/2662304/53e3076d-ee1c-4e36-b1ac-ef5e6caf28ff) The Integrations Marketplace is still in beta. We improve it every day, and we think it is already quite useful. ### SystemD Journal <a id="v1420-systemd-journal"></a> A new Netdata Function has been added to query the **systemd journal** logs: ![2023-08-08 16-04-49](https://github.com/netdata/netdata/assets/2662304/a8f2c049-b1ae-44cf-abbd-8e3a00ac95bd) The function respects the current date-time picker, so it can query any possible timeframe the systemd journal has data for. > **IMPORTANT**<br/> > Netdata Functions are available only when you are signed in to Netdata and your Netdata Agent is claimed. > This has been done to protect your privacy. Netdata Cloud checks that the users of the Agent dashboard are allowed to view this information. > **IMPORTANT**<br/> > The `systemd-journal` function is currently available only on Netdata Agents that have been installed from source, or with native packages of the Linux distribution (RPM, DEB).
For users running static builds of Netdata or running Netdata in a Docker container, we are working to bring `systemd-journal` to them too. Stay tuned... ### Claiming via the UI <a id="v1420-claiming-via-the-ui"></a> You can now connect your agents to Netdata Cloud via the dashboard: ![2023-08-08 15-53-30](https://github.com/netdata/netdata/assets/2662304/91275928-1bba-4824-ab65-42a968d4f4f0) The UI verifies that you are the owner of a Netdata Agent by asking you to provide a random key that is saved to a file on disk. Once you provide the right key, the Agent is automatically claimed to your space at Netdata Cloud. ### Easily Spot Anomalies <a id="v1420-easily-spot-anomalies"></a> The UI has an `AR` button above the menu. When you press it, the dashboard queries the Netdata Metrics Scoring Engine to find the anomaly rates for the visible timeframe, across the metrics included in the dashboard. Then it adds a badge next to each category and subcategory, showing its anomaly rate. This way, you can quickly spot what is anomalous on the current view of the dashboard. ![2023-08-08 16-25-44](https://github.com/netdata/netdata/assets/2662304/0d4d93f1-b471-4c8b-b308-198b6b57c7ef) ## Acknowledgements <a id="v1420-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @Leny1996 for fixing Docker bind-mount stock files creation. - @fhriley for adding Linux power cap Intel RAPL metrics collector. - @icy17 for fixing potential crash in the h2o server. - @kiela for fixing typos and images placement in the Deployment Strategies doc. - @zeylos for fixing non-interactive options for apt-get and zypper.
## Contributions <a id="v1420-contributions"></a> ### Collectors <a id="v1420-contributions-collectors"></a> ### New - Add AMD GPU collector (proc.plugin)([#15515](https://github.com/netdata/netdata/pull/15515), [@Dim-P](https://github.com/Dim-P)) - Add PCI Advanced Error Reporting metrics collector (proc.plugin) ([#15488](https://github.com/netdata/netdata/pull/15488), [@ktsaou](https://github.com/ktsaou)) - Add Linux power cap Intel RAPL metrics collector (proc.plugin) ([#15364](https://github.com/netdata/netdata/pull/15364), [@fhriley](https://github.com/fhriley)) - Add systemd-journal plugin (systemd-journal.plugin)([#15363](https://github.com/netdata/netdata/pull/15363), [@ktsaou](https://github.com/ktsaou)) ### Improvements - Collect EDAC metrics per-memory controller (MC) and DIMM (proc.plugin) ([#15473](https://github.com/netdata/netdata/pull/15473), [@ktsaou](https://github.com/ktsaou)) ### Bug fixes - Fix power readings for new drivers (python.d/nvidia_smi) ([#15755](https://github.com/netdata/netdata/issues/15755), [@ilyam8](https://github.com/ilyam8)) - Dont log if pressure/irq does not exist (proc.plugin) ([#15732](https://github.com/netdata/netdata/pull/15732), [@ilyam8](https://github.com/ilyam8)) - Fix no data after 24 hours (ebpf.plugin) ([#15694](https://github.com/netdata/netdata/pull/15694), [@thiagoftsm](https://github.com/thiagoftsm)) - Disable freeipmi.plugin in Docker by default (freeipmi.plugin) ([#15651](https://github.com/netdata/netdata/pull/15651), [@ilyam8](https://github.com/ilyam8)) - Fix system swap calls (ebpf.plugin) ([#15553](https://github.com/netdata/netdata/pull/15553), [@ilyam8](https://github.com/ilyam8)) - Fix keepalive (freeipmi.plugin) ([#15499](https://github.com/netdata/netdata/pull/15499), [@ilyam8](https://github.com/ilyam8)) - Fix wrong logging about FD limits (apps.plugin) ([#15467](https://github.com/netdata/netdata/pull/15467), [@ktsaou](https://github.com/ktsaou)) ### Other - Change restart message to info 
(freeipmi.plugin) ([#15664](https://github.com/netdata/netdata/pull/15664), [@ilyam8](https://github.com/ilyam8)) - Filter out systemd-udevd.service/udevd cgroup (cgroups.plugin) ([#15571](https://github.com/netdata/netdata/pull/15571), [@ilyam8](https://github.com/ilyam8)) - Improve FD limit issue tracing (apps.plugin) ([#15504](https://github.com/netdata/netdata/pull/15504), [@ktsaou](https://github.com/ktsaou)) - Add hash table charts for internal monitoring (ebpf.plugin) ([#15323](https://github.com/netdata/netdata/pull/15323), [@thiagoftsm](https://github.com/thiagoftsm)) ### Documentation <a id="v1420-contributions-documentation"></a> - Fix spelling errors in README.md ([#15658](https://github.com/netdata/netdata/pull/15658), [@ilyam8](https://github.com/ilyam8)) - Fix typos and images placement in the Deployment Strategies doc ([#15606](https://github.com/netdata/netdata/pull/15606), [@kiela](https://github.com/kiela)) - Improve spelling in README.md ([#15601](https://github.com/netdata/netdata/pull/15601), [@tkatsoulas](https://github.com/tkatsoulas)) - Improve emojis in README.md ([#15583](https://github.com/netdata/netdata/pull/15583), [@andrewm4894](https://github.com/andrewm4894)) - Fix apps.plugin fd badges and typos ([#15539](https://github.com/netdata/netdata/pull/15539), [@ilyam8](https://github.com/ilyam8)) - Change api.netdata.cloud to app.netdata.cloud ([#15538](https://github.com/netdata/netdata/pull/15538), [@ilyam8](https://github.com/ilyam8)) - Change nvidia_smi link to go version in COLLECTORS.md ([#15536](https://github.com/netdata/netdata/pull/15536), [@Ancairon](https://github.com/Ancairon)) - Add `diskquota` collector to third party collectors list ([#15524](https://github.com/netdata/netdata/pull/15524), [@andrewm4894](https://github.com/andrewm4894)) - Clarify health percentage option ([#15492](https://github.com/netdata/netdata/pull/15492), [@ilyam8](https://github.com/ilyam8)) - Fix links to Agent Dashboard 
([#15479](https://github.com/netdata/netdata/pull/15479), [@Ancairon](https://github.com/Ancairon)) - Note that health foreach works only with template ([#15478](https://github.com/netdata/netdata/pull/15478), [@ilyam8](https://github.com/ilyam8)) - Overhaul deployment strategies documentation ([#15464](https://github.com/netdata/netdata/pull/15464), [@ralphm](https://github.com/ralphm)) - Reorder cols in "what's new" and add links in README.md ([#15455](https://github.com/netdata/netdata/pull/15455), [@andrewm4894](https://github.com/andrewm4894)) - Update netdata-functions.md ([#14441](https://github.com/netdata/netdata/pull/14441), [@shyamvalsan](https://github.com/shyamvalsan)) ### Packaging / Installation <a id="v1420-contributions-packaging"></a> - Add dependencies for systemd journal plugin ([#15747](https://github.com/netdata/netdata/pull/15747), [@Ferroin](https://github.com/Ferroin)) - Prefer capability over setuid for systemd-journal in installer ([#15741](https://github.com/netdata/netdata/pull/15741), [@ilyam8](https://github.com/ilyam8)) - Add netdata-plugin-systemd-journal package ([#15733](https://github.com/netdata/netdata/pull/15733), [@Ferroin](https://github.com/Ferroin)) - Fix systemd-journal makefile ([#15727](https://github.com/netdata/netdata/pull/15727), [@ktsaou](https://github.com/ktsaou)) - Update go.d.plugin to v0.54.1 ([#15692](https://github.com/netdata/netdata/pull/15692), [@ilyam8](https://github.com/ilyam8)) - Fix edit-config for containerized Netdata when running from host ([#15641](https://github.com/netdata/netdata/pull/15641), [@ilyam8](https://github.com/ilyam8)) - Fix Docker bind-mount stock files creation ([#15639](https://github.com/netdata/netdata/pull/15639), [@Leny1996](https://github.com/Leny1996)) - Add a machine distinct id to analytics ([#15485](https://github.com/netdata/netdata/pull/15485), [@MrZammler](https://github.com/MrZammler)) - Drop support for native packages of Ubuntu 22.10 
([#15292](https://github.com/netdata/netdata/pull/15292), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix non-interactive options for apt-get and zypper ([#15288](https://github.com/netdata/netdata/pull/15288), [@zeylos](https://github.com/zeylos)) ### Health <a id="v1420-contributions-health"></a> - Disable systemdunits alarms ([#15726](https://github.com/netdata/netdata/pull/15726), [@ilyam8](https://github.com/ilyam8)) - Remove the noise by silencing alerts that don't need to wake up people ([#15590](https://github.com/netdata/netdata/pull/15590), [@ktsaou](https://github.com/ktsaou)) ### Other Notable Changes <a id="v1420-contributions-other"></a> ### Improvements - Add support for SNI and chunking to ACLK ([#15739](https://github.com/netdata/netdata/pull/15739), [@underhood](https://github.com/underhood)) - Prefer titles, families, units and priorities from collected charts ([#15614](https://github.com/netdata/netdata/pull/15614), [@ktsaou](https://github.com/ktsaou)) - Speed up AR calculation ([#15595](https://github.com/netdata/netdata/pull/15595), [@ktsaou](https://github.com/ktsaou)) ### Bug Fixes - Fix memory corruption ([#15724](https://github.com/netdata/netdata/pull/15724), [@stelfrag](https://github.com/stelfrag)) - Fix CPU frequency calculation when using /proc/cpuinfo in system-info.sh ([#15584](https://github.com/netdata/netdata/pull/15584), [@ilyam8](https://github.com/ilyam8)) - Allow creating alert hashes with --disable-cloud ([#15519](https://github.com/netdata/netdata/pull/15519), [@MrZammler](https://github.com/MrZammler)) - Allow manage/health API call without bearer ([#15503](https://github.com/netdata/netdata/pull/15503), [@MrZammler](https://github.com/MrZammler)) - Fix the calculation of incremental-sum ([#15468](https://github.com/netdata/netdata/pull/15468), [@ktsaou](https://github.com/ktsaou)) ### Code organization - Faster facets and journal fixes ([#15737](https://github.com/netdata/netdata/pull/15737), 
[@ktsaou](https://github.com/ktsaou)) - Adjust namespace used for sd_journal_open ([#15736](https://github.com/netdata/netdata/pull/15736), [@stelfrag](https://github.com/stelfrag)) - Fix the freez pointer of dyncfg ([#15719](https://github.com/netdata/netdata/pull/15719), [@ktsaou](https://github.com/ktsaou)) - Better cleanup of aclk alert table entries ([#15706](https://github.com/netdata/netdata/pull/15706), [@MrZammler](https://github.com/MrZammler)) - Fix potential crash bug in h2o server ([#15605](https://github.com/netdata/netdata/pull/15605), [@icy17](https://github.com/icy17)) - Fix the health query that fetches the maximum unique id ([#15589](https://github.com/netdata/netdata/pull/15589), [@stelfrag](https://github.com/stelfrag)) - Add missing file in CMakeLists.txt ([#15574](https://github.com/netdata/netdata/pull/15574), [@stelfrag](https://github.com/stelfrag)) - Drop duplicate / unused index ([#15568](https://github.com/netdata/netdata/pull/15568), [@stelfrag](https://github.com/stelfrag)) - Detect the path where the netdata-claim.sh script is located ([#15556](https://github.com/netdata/netdata/pull/15556), [@ktsaou](https://github.com/ktsaou)) - Fix expiration dates for API responses ([#15546](https://github.com/netdata/netdata/pull/15546), [@ktsaou](https://github.com/ktsaou)) - Add cloud status in registry?action=hello ([#15530](https://github.com/netdata/netdata/pull/15530), [@ktsaou](https://github.com/ktsaou)) - Wait for node_id while claiming ([#15526](https://github.com/netdata/netdata/pull/15526), [@ktsaou](https://github.com/ktsaou)) - Avoid an extra uuid_copy when creating new MRG entries ([#15502](https://github.com/netdata/netdata/pull/15502), [@stelfrag](https://github.com/stelfrag)) - Fix resource leak - CID 396310 ([#15491](https://github.com/netdata/netdata/pull/15491), [@stelfrag](https://github.com/stelfrag)) - Improve update of the alert chart name in the database ([#15490](https://github.com/netdata/netdata/pull/15490), 
[@stelfrag](https://github.com/stelfrag)) - Dynamic Config MVP0 ([#15486](https://github.com/netdata/netdata/pull/15486), [@underhood](https://github.com/underhood)) - Fix coverity issue ([#15475](https://github.com/netdata/netdata/pull/15475), [@stelfrag](https://github.com/stelfrag)) - Store and transmit chart_name to cloud in alert events ([#15441](https://github.com/netdata/netdata/pull/15441), [@MrZammler](https://github.com/MrZammler))

## Deprecation notice <a id="v1420-deprecation-notice"></a>

We plan to change the following items in the next release (v1.43.0):

| Component | Type | Change | Action |
|---|---|---|:---:|
| [apps.plugin](https://github.com/netdata/netdata/tree/master/collectors/apps.plugin) | collector | a dimension for each group/user/user group => a chart for each group/user/user group | |
| [cgroups.plugin](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin) | collector | a dimension for each systemd service => a chart for each systemd service | |
| [proc.plugin](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin) | collector | all "Networking Stack" metrics except "tcp" => "IPv4 Networking" | |
| [python.d/nvidia_smi](https://github.com/netdata/netdata/tree/v1.41.0/collectors/python.d.plugin/nvidia_smi) | collector | deprecated | use [go.d/nvidia_smi](https://github.com/netdata/go.d.plugin/tree/master/modules/nvidia_smi) |
| `family` attribute | alert configuration and Health API | deprecated | use [chart labels](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-chart-labels) |

## Netdata Release Meetup <a id="v1420-netdata-release-meetup"></a>

Join
the Netdata team on the **11th of August at 17:00 UTC** for the [Netdata Release Meetup](https://www.meetup.com/netdata/events/294900951/). Together we’ll cover: - Release Highlights. - Acknowledgements. - Q&A with the community. [RSVP now](https://www.meetup.com/netdata/events/294900951/) - we look forward to meeting you. ## Support options <a id="v1420-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1600 engineers are already using it! 2023-08-09T17:29:50+00:00 netdata v1.42.1 netdata v1.42.1 2023-08-16T16:49:33+00:00 Netdata v1.42.1 is a patch release to address issues discovered since [v1.42.0](https://github.com/netdata/netdata/releases/tag/v1.42.0). 
This patch release provides the following bug fixes and updates: - Fixed issue with missing entries for Systemd-journal and Processes functions ([#15814](https://github.com/netdata/netdata/pull/15814), [@ktsaou](https://github.com/ktsaou)) - Fixed linking health.log to stdout in Docker ([#15813](https://github.com/netdata/netdata/pull/15813), [@ilyam8](https://github.com/ilyam8)) - Updated UI version to v6.28.0 ([#15810](https://github.com/netdata/netdata/pull/15810), [@ilyam8](https://github.com/ilyam8)) - Fixed 401 when behind a proxy with Basic auth and signed in ([#15808](https://github.com/netdata/netdata/pull/15808), [@ktsaou](https://github.com/ktsaou)) - Fixed Health Management API ([#15806](https://github.com/netdata/netdata/pull/15806), [@underhood](https://github.com/underhood)) - Fixed build deps in DEB packages for systemd-journal.plugin ([#15805](https://github.com/netdata/netdata/pull/15805), [@Ferroin](https://github.com/Ferroin)) - Cleaned up python deps for RPM packages ([#15804](https://github.com/netdata/netdata/pull/15804), [@Ferroin](https://github.com/Ferroin)) - Added proper SUID fallback for DEB plugin packages ([#15803](https://github.com/netdata/netdata/pull/15803), [@Ferroin](https://github.com/Ferroin)) - Fixed an issue where the nd_journal_process column was not populated for the Systemd-journal function ([#15798](https://github.com/netdata/netdata/pull/15798), [@ktsaou](https://github.com/ktsaou)) - Fixed negative retention when database is empty in /api/v2/info ([#15796](https://github.com/netdata/netdata/pull/15796), [@ktsaou](https://github.com/ktsaou)) - Fixed handling of unassigned drives for python.d/hpssa ([#15793](https://github.com/netdata/netdata/pull/15793), [@ilyam8](https://github.com/ilyam8)) - Fixed an issue that prevented systemd-journal.plugin from restarting ([#15787](https://github.com/netdata/netdata/pull/15787), [@ktsaou](https://github.com/ktsaou)) - Fixed publishing of openSUSE 15.5 packages 
([#15781](https://github.com/netdata/netdata/pull/15781), [@tkatsoulas](https://github.com/tkatsoulas)) - Updated OpenSSL version of static builds to 1.1.1v ([#15779](https://github.com/netdata/netdata/pull/15779), [@tkatsoulas](https://github.com/tkatsoulas)) ## Support options As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1600 engineers are already using it! 2023-08-16T16:49:33+00:00 netdata v1.42.2 netdata v1.42.2 2023-08-28T15:57:59+00:00 Netdata v1.42.2 is a patch release to address issues discovered since [v1.42.1](https://github.com/netdata/netdata/releases/tag/v1.42.1). 
This patch release provides the following bug fixes and updates: - Fixed plugins dependency in native packages ([#15861](https://github.com/netdata/netdata/pull/15861), [@Ferroin](https://github.com/Ferroin)) - Added an option to avoid duplicate labels when exporting in Prometheus format ([#15860](https://github.com/netdata/netdata/pull/15860), [@kevin-fwu](https://github.com/kevin-fwu)) - Fixed RAM cached and used calculation ([#15859](https://github.com/netdata/netdata/pull/15859), [@ilyam8](https://github.com/ilyam8)) - Fixed static build OpenSSL on 32bit ([#15855](https://github.com/netdata/netdata/pull/15855), [@ilyam8](https://github.com/ilyam8)) - Fixed static build release channel for stable version ([#15854](https://github.com/netdata/netdata/pull/15854), [@ilyam8](https://github.com/ilyam8)) - Fixed crash when multiple collectors update the same chart ([#15845](https://github.com/netdata/netdata/pull/15845), [@ktsaou](https://github.com/ktsaou)) - Fixed OpenSSL threading support for static build ([#15842](https://github.com/netdata/netdata/pull/15842), [@ktsaou](https://github.com/ktsaou)) - Updated UI version to v6.29.0 ([#15841](https://github.com/netdata/netdata/pull/15841), [@ilyam8](https://github.com/ilyam8)) - Fixed permission attributes for conf.d dirs for RPM ([#15828](https://github.com/netdata/netdata/pull/15828), [@k0ste](https://github.com/k0ste)) - Fixed resource leak in web API ([#15827](https://github.com/netdata/netdata/pull/15827), [@stelfrag](https://github.com/stelfrag)) - Fixed use after free ([#15825](https://github.com/netdata/netdata/pull/15825), [@stelfrag](https://github.com/stelfrag)) - Fixed DYNCFG thread takes too long to exit ([#15824](https://github.com/netdata/netdata/pull/15824), [@underhood](https://github.com/underhood)) - Increased alert snapshot chunk size ([#15748](https://github.com/netdata/netdata/pull/15748), [@MrZammler](https://github.com/MrZammler)) ## Acknowledgements <a id="v1422-acknowledgements"></a> We 
would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @kevin-fwu for adding an option to avoid duplicate labels when exporting in Prometheus format. - @k0ste for fixing permission attributes for conf.d dirs for RPM. ## Support options <a id="v1422-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1600 engineers are already using it! 2023-08-28T15:57:59+00:00 netdata v1.42.3 netdata v1.42.3 2023-09-11T16:35:31+00:00 Netdata v1.42.3 is a patch release to address issues discovered since [v1.42.2](https://github.com/netdata/netdata/releases/tag/v1.42.2). This patch release provides the following bug fixes and updates: - Fixed memory leak in Prometheus exporter ([#15929](https://github.com/netdata/netdata/pull/15929), [@ktsaou](https://github.com/ktsaou)). 
- Fixed handling of closed connections in streaming ([#15771](https://github.com/netdata/netdata/pull/15771), [@moonbreon](https://github.com/moonbreon)). ## Acknowledgements <a id="v1423-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @moonbreon for improving handling of closed connections in streaming. ## Support options <a id="v1423-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-09-11T16:35:31+00:00 netdata v1.42.4 netdata v1.42.4 2023-09-18T15:11:22+00:00 Netdata v1.42.4 is a patch release to address issues discovered since [v1.42.3](https://github.com/netdata/netdata/releases/tag/v1.42.3). 
This patch release provides the following bug fixes and updates: - Fixed alarm variables not being created for all chart dimensions. ([#15984](https://github.com/netdata/netdata/pull/15984), [@MrZammler](https://github.com/MrZammler)). ## Support options <a id="v1424-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-09-18T15:11:22+00:00 netdata v1.43.0 netdata v1.43.0 2023-10-16T21:00:34+00:00 ## Groundbreaking: `systemd-journal` logs release! 
![](https://user-images.githubusercontent.com/2662304/275458055-14257b47-f374-4a58-bff9-d3b217a9421b.gif) ## Table of Contents - [Netdata Growth](#v1430-netdata-open-source-growth) - [Release Summary](#v1430-release-summary) - [Release Highlights](#v1430-release-highlights) - [systemd journal improvements](#v1430-systemd-journal) - [Virtual Machine monitoring (VMWare vSphere)](#v1430-virtual-machine-monitoring) - [What is coming next](#v1430-coming-next) - [Acknowledgments](#v1430-acknowledgments) - [Contributions](#v1430-contributions) - [Collectors](#v1430-contributions-collectors) - [Packaging/Installation](#v1430-contributions-packaging) - [Documentation](#v1430-contributions-documentation) - [Health](#v1430-contributions-health) - [Other Notable Changes](#v1430-contributions-other) - [Deprecation notice](#v1430-deprecation-notice) - [Netdata Release Meetup](#v1430-netdata-release-meetup) - [Support options](#v1430-support-options) True to our schedule, this is another great Netdata release! ## Netdata Growth <a id="v1430-netdata-open-source-growth"></a> - **65.5 k GitHub Stars ⭐** Since October 2023, Netdata has been leading the observability category in the [CNCF landscape](https://landscape.cncf.io/card-mode?category=observability-and-analysis&grouping=no&sort=stars), surpassing Elasticsearch. Thank you for your love ❤️! [Give Netdata a ⭐ too, on GitHub!](https://github.com/netdata/netdata) - **595 M docker hub pulls** Netdata sees about 200k docker hub downloads per day. Since June 2023 we have been a [Verified Publisher](https://hub.docker.com/r/netdata/netdata), so Netdata pulls don't count against docker hub pull limits for our users, allowing all our users to integrate Netdata into their CI/CD toolchains. ## Release Summary <a id="v1430-release-summary"></a> This release is the most robust and reliable Netdata we have ever built. These are the main areas Netdata has improved since the last release: 1. 
**Logs** Today we release an almost completely rewritten version of `systemd-journal`, to improve its performance and visualization capabilities. `systemd-journal` holds critical system and security information, and given the lack of `systemd-journal` visualization tools, we focused first on filling this gap. At the same time, we are standardizing how logs are handled within Netdata, enabling us to support more log management engines, like Loki and Elasticsearch. 2. **Instances Slice and Dice** Given the capabilities of the new Netdata Agent UI (v2), we are changing the way some of our collectors collect and expose metrics, to allow easier slicing and dicing of the data and to be more OpenTelemetry compatible in terms of specifications. So, in this release we changed the way `apps.plugin` exposes charts in the `Applications` section of the dashboard. Following the NIDL framework, each application group is now an instance, allowing better aggregation of process utilization across nodes. Similarly, our `systemd` units charts have been updated to have an instance for each `systemd` unit. For the same reasons, disk charts now have additional labels (`id`, `model` and `serial`) to help us identify disks from the charts. Unfortunately, such changes tend to make the older dashboards (v1, v0) less usable, especially on servers with many hundreds of instances. 3. **Stock Alerts** A number of changes have been made to the Netdata Health engine, to allow better integration with the new dashboard. More changes in this area are coming as part of the next release: a) allow multi-node alerts on parents, b) allow evaluating and configuring alerts from the UI. 4. **Alerts Accuracy** Netdata has by default 3 tiers of metrics, each with a different resolution. The Netdata query planner automatically picks the right tier to satisfy a query, based on the number of points requested in the response. For alerts, this had a side effect. 
Since alerts request only 1 point of data in the response, the query planner was picking the "easier" tier to query, which is of course the one with the lowest resolution. Now alerts always run on tier 0, the highest-resolution one. 5. **Lower Resources Utilization** Several changes have been implemented for Netdata to better take care of itself. These include lower memory usage, lower disk footprint, self vacuuming of SQLite databases, and more. Probably the most notable change is that Netdata now needs only 1 pointer (8 bytes on 64 bit, 4 bytes on 32 bit) for each use of a label `name-value` combination. This drastically reduces Netdata's memory requirements in setups like busy k8s clusters, where containers come and go all the time, significantly increasing label cardinality. 6. **32bit Netdata on 64bit IoT machines** A common request when Netdata is installed on 64bit IoT devices is to have a 32bit Netdata running there. Before this release, this was not possible. Now a 32bit Netdata will nicely run on a 64bit operating system. 7. **Netdata Cloud on prem** Netdata Cloud is now available to be installed on-prem! Several companies have already deployed it and are currently testing it. If you want to join them, [submit this form](https://www.netdata.cloud/contact-us/?subject=on-prem). ## Release Highlights <a id="v1430-release-highlights"></a> ### `systemd-journal` <a id="v1430-systemd-journal"></a> `systemd-journal` was first included in Netdata v1.42.0. Immediately after release, we recognized the wider need for this feature, so we've rewritten the plugin almost entirely, to provide the best possible experience. This work is also fundamental for supporting more log monitoring integrations - stay tuned! 
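As a brief aside on the **Alerts Accuracy** item in the Release Summary above, the side effect can be illustrated with a simplified sketch. This is not Netdata's actual query planner: the `Tier` class, the `pick_tier` and `pick_tier_for_alert` functions, and the per-tier granularities (1 second, 1 minute, 1 hour) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    index: int
    seconds_per_point: int  # granularity of one stored sample


# Assumed granularities, for illustration only; ordered from finest to coarsest.
TIERS = (Tier(0, 1), Tier(1, 60), Tier(2, 3600))


def pick_tier(window_seconds, points, tiers=TIERS):
    """Hypothetical planner: pick the coarsest tier that can still
    deliver `points` samples over a window of `window_seconds`."""
    needed = window_seconds / points  # seconds each returned point must cover
    chosen = tiers[0]
    for tier in tiers:
        if tier.seconds_per_point <= needed:
            chosen = tier
    return chosen


def pick_tier_for_alert(window_seconds, points, tiers=TIERS):
    """Behaviour described in this release: alerts always use tier 0."""
    return tiers[0]


if __name__ == "__main__":
    # A chart asking for 600 points over one hour needs fine-grained data.
    print(pick_tier(3600, 600).index)
    # An alert asking for 1 point over one hour is satisfied by the coarsest tier.
    print(pick_tier(3600, 1).index)
    # With the fix, the alert is evaluated on tier 0 regardless.
    print(pick_tier_for_alert(3600, 1).index)
```

Under these assumptions, a single-point request over any long window always falls through to the coarsest tier, which is why pinning alerts to tier 0 restores their accuracy.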
The major improvements to the `systemd-journal` logs function are: * a **histogram** of log entries over time, with a breakdown per field-value, for any field and any time-frame * a **PLAY** mode that provides the same experience as `journalctl -f`, showing new log entries immediately after they are received * filtering on any **journal field** or **field value**, for any time-frame * coloring of log entries, the same way `journalctl` does For a full presentation of the `systemd-journal` plugin, including how it works, how to take full advantage of it, and instructions for configuring a logs centralization server, check the [documentation for the plugin](https://learn.netdata.cloud/docs/data-collection/systemd-journal). ![chrome_tf8dV0qS5x](https://github.com/netdata/netdata/assets/82235632/d0d7208b-24b9-4192-85dc-5171d75204ec) You can experience the power of the `systemd-journal` logs function in one of our Netdata demo rooms [here](https://app.netdata.cloud/spaces/netdata-demo/rooms/all-nodes/functions?oauth=google&_gl=1*1wpvrxe*_ga*MzIyMjg4OTE3LjE2OTcwNDE2NjM.*_ga_J69Z2JCTFB*MTY5NzIxNjg3Ny4xNy4xLjE2OTcyMTY4OTQuNDMuMC4w#after=-21600&before=0&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a-fn-selectedFn-arr=systemd-journal&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a-fn-selectedNodeIds-arr=e3b4cd99-19a7-467b-841a-09314dcafc51&selectedFn-arr=systemd-journal&selectedNodeIds-arr=d8e944dd-d061-4bc9-a850-0ac2ee4ff87f&d8a4e0c5-7c79-4145-900e-83a9f06fcb6a-systemd-journalFilters-source-arr=all) or check our latest [YouTube video](https://www.youtube.com/watch?v=-PLUjVXwC4Q) on it. Want to know why you should tap the full potential of `systemd-journal` logs? Check out the blog post on it by Netdata's founder, Costa Tsaousis ([@ktsaou](https://github.com/ktsaou)), [here](https://blog.netdata.cloud/systemd-journal-logs-a-game-changer-for-devops-and-developers/). 
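To make the histogram and filtering ideas above more concrete, here is a minimal sketch of how log entries could be filtered by field value and then counted per time bucket, broken down by the values of a chosen field. It is not the plugin's implementation. The function names `filter_entries` and `journal_histogram` and the sample entries are hypothetical; the field names `__REALTIME_TIMESTAMP` (microseconds since the Unix epoch), `PRIORITY` and `_SYSTEMD_UNIT` are standard journal fields.

```python
from collections import Counter, defaultdict


def filter_entries(entries, **wanted):
    """Keep only entries whose fields match every requested field value."""
    return [e for e in entries
            if all(e.get(field) == value for field, value in wanted.items())]


def journal_histogram(entries, field, bucket_seconds):
    """Count entries per time bucket, broken down by the value of `field`.

    Returns {bucket_start_epoch_seconds: {field_value: count}}.
    """
    histogram = defaultdict(Counter)
    bucket_usec = bucket_seconds * 1_000_000
    for entry in entries:
        ts_usec = int(entry["__REALTIME_TIMESTAMP"])
        bucket_start = (ts_usec // bucket_usec) * bucket_seconds
        # Entries that lack the field are grouped under a placeholder value.
        histogram[bucket_start][entry.get(field, "[unset]")] += 1
    return {bucket: dict(counts) for bucket, counts in sorted(histogram.items())}


# Hypothetical sample entries, shaped like JSON-exported journal records.
SAMPLE_ENTRIES = [
    {"__REALTIME_TIMESTAMP": "1697216400000000", "PRIORITY": "6", "_SYSTEMD_UNIT": "ssh.service"},
    {"__REALTIME_TIMESTAMP": "1697216405000000", "PRIORITY": "3", "_SYSTEMD_UNIT": "ssh.service"},
    {"__REALTIME_TIMESTAMP": "1697216470000000", "PRIORITY": "6", "_SYSTEMD_UNIT": "cron.service"},
    {"__REALTIME_TIMESTAMP": "1697216475000000", "PRIORITY": "6"},
]

if __name__ == "__main__":
    # One-minute buckets, broken down by priority.
    for bucket, counts in journal_histogram(SAMPLE_ENTRIES, "PRIORITY", 60).items():
        print(bucket, counts)
    # Same histogram, restricted to a single unit.
    ssh_only = filter_entries(SAMPLE_ENTRIES, _SYSTEMD_UNIT="ssh.service")
    print(journal_histogram(ssh_only, "PRIORITY", 60))
```

Because the breakdown field is just a parameter, the same loop serves any field and any time-frame, which mirrors the "any field, any time-frame" behaviour listed above.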
### Virtual Machine monitoring (VMware vSphere) <a id="v1430-virtual-machine-monitoring"></a> With the increased feedback and requests on VMware vCenter Server collectors we have: * Reviewed our out-of-the-box charts * Added labels to the charts, e.g. `host`, `datacenter`, `cluster`, `vm` * Reviewed the metadata on alerts * Added a summary charts section It is with this feedback from the Community that we can keep working on improving Netdata to ensure it meets your needs! ## What is coming next <a id="v1430-coming-next"></a> We are currently working on the following areas, which we hope to release next month: 1. **Logs Explorer for Loki and Elasticsearch** Similar to `systemd-journal`, allow Netdata to explore, query and visualize logs from Loki and Elasticsearch. 2. **Collectors Configuration from the UI** In the last release we presented the Integrations Marketplace. Since then, we have been working to make all integrations configurable via the dashboard. This will allow all of us to configure our Netdata servers directly from the UI, without touching configuration files, significantly improving Netdata's usability. 3. **Alerts Configuration from the UI** Similarly, we are working to allow configuring alerts directly from the UI, without text file configurations, so that all of us can create powerful alerts on the spot. 4. **Netdata Mobile App** We are at the final stage of releasing our Netdata Mobile App (iOS and Android) for receiving mobile push notifications and exploring alert statuses. 5. **Scalability** Given the wide adoption of Netdata, we are committed to making Netdata scale better in larger environments. Especially when it comes to Netdata parents, we aim to provide the best scalability possible. 
We are currently finalizing the necessary changes to allow Netdata to achieve: - 1 CPU core per 1 million metrics/s for data collection - 1 CPU core per 1 million metrics/s for ML and health (alerts) - 1 CPU core per 1 million metrics/s for re-streaming (pushing metrics to another parent) Of course, the numbers depend on the CPU and its clock, but they shouldn't vary significantly on modern systems. At the same time, we are working to integrate **Gorilla compression** into our database. This will provide a significantly better overall memory footprint for Netdata. ## Acknowledgments <a id="v1430-acknowledgments"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @MAH69IK for improving ntfy notification title. - @chpfm for fixing slave/user metrics collection stopping when query times out in go.d/mysql. - @k0ste for various installation improvements on CentOS-Stream. - @kylemanna for fixing an issue where a properly functioning sensor was skipped due to limits in python.d/sensors. - @miversen33 for adding access control configuration to ntfy notification method. - @novotnyJiri for fixing the wrong path in ansible-playbook deployment guide. - @theggs for adding installation description for Homebrew on Apple Silicon. - @vpnable for fixing counting UNDEF as users in go.d/openvpn_status_log. - @zhqu1148980644 for fixing docker-compose example. 
- @luisj1983 for implementing molecule tests in the `netdata/ansible` playbook ## Contributions <a id="v1430-contributions"></a> ### Collectors <a id="v1430-contributions-collectors"></a> <details> <summary>All changes</summary> #### Improvements - Improve exposing metrics by creating a chart for each app group/user/user group (apps.plugin) ([#16095](https://github.com/netdata/netdata/pull/16095), [@thiagoftsm](https://github.com/thiagoftsm)) - Add env NETDATA_LOG_SEVERITY_LEVEL support to external collectors ([#16089](https://github.com/netdata/netdata/pull/16089), [@ilyam8](https://github.com/ilyam8)) - Add env NETDATA_LOG_SEVERITY_LEVEL support (charts.d.plugin) ([#16085](https://github.com/netdata/netdata/pull/16085), [@ilyam8](https://github.com/ilyam8)) - Add env NETDATA_LOG_SEVERITY_LEVEL support (python.d.plugin) ([#16084](https://github.com/netdata/netdata/pull/16084), [@ilyam8](https://github.com/ilyam8)) - Improve performance by reading files sequentially (systemd-journal.plugin) ([#16038](https://github.com/netdata/netdata/pull/16038), [@ktsaou](https://github.com/ktsaou)) - Add systemd-journal plugin to apps_groups.conf (apps.plugin) ([#16024](https://github.com/netdata/netdata/pull/16024), [@ilyam8](https://github.com/ilyam8)) - Improve exposing metrics by creating a chart for each systemd service (cgroups.plugin) ([#15975](https://github.com/netdata/netdata/pull/15975), [@thiagoftsm](https://github.com/thiagoftsm)) - Add disk labels (proc/diskstats) ([#15949](https://github.com/netdata/netdata/pull/15949), [@ktsaou](https://github.com/ktsaou)) - Add support for opening journal files when running inside a container (systemd-journal.plugin) ([#15830](https://github.com/netdata/netdata/pull/15830), [@ktsaou](https://github.com/ktsaou)) - Add env NETDATA_LOG_SEVERITY_LEVEL support (go.d.plugin) ([#1351](https://github.com/netdata/go.d.plugin/pull/1351), [@ilyam8](https://github.com/ilyam8)) - Add "network" config option that allows configuration of DNS 
resolution (go.d/ping) ([#1348](https://github.com/netdata/go.d.plugin/pull/1348), [@ilyam8](https://github.com/ilyam8)) - Add "custom_numeric_fields" config option (go.d/web_log) ([#1343](https://github.com/netdata/go.d.plugin/pull/1343), [@ilyam8](https://github.com/ilyam8)) - Add upsd (NUT) collector (go.d/upsd) ([#1341](https://github.com/netdata/go.d.plugin/pull/1341), [@ilyam8](https://github.com/ilyam8)) - Improve status chart by making it a dimension per status (go.d/vcsa) ([#1332](https://github.com/netdata/go.d.plugin/pull/1332), [@ilyam8](https://github.com/ilyam8)) - Add label to vm/host charts (go.d/vsphere) ([#1331](https://github.com/netdata/go.d.plugin/pull/1331), [@ilyam8](https://github.com/ilyam8)) #### Bug fixes - Fix 1-second latency in play mode (systemd-journal.plugin) ([#16123](https://github.com/netdata/netdata/pull/16123), [@ktsaou](https://github.com/ktsaou)) - Fix an issue where ipv4 metrics were exposed as ip (proc/netstat) ([#16122](https://github.com/netdata/netdata/pull/16122), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where OOMKill was created unconditionally (ebpf.plugin) ([#16115](https://github.com/netdata/netdata/pull/16115), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix an issue where ebpf threads did not respect the enable/disable value in the configuration (ebpf.plugin) ([#16083](https://github.com/netdata/netdata/pull/16083), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix using undefined var when loading job statuses (python.d.plugin) ([#15965](https://github.com/netdata/netdata/pull/15965), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where a properly functioning sensor was skipped due to limits (python.d/sensors) ([#15905](https://github.com/netdata/netdata/pull/15905), [@kylemanna](https://github.com/kylemanna)) - Fix slave/user metrics collection stopping when query times out (go.d/mysql) ([#1346](https://github.com/netdata/go.d.plugin/pull/1346), [@chpfm](https://github.com/chpfm)) - Fix 
counting UNDEF as users (go.d/openvpn_status_log) ([#1334](https://github.com/netdata/go.d.plugin/pull/1334), [@vpnable](https://github.com/vpnable)) - Fix an issue where power metrics were not collected due to renaming (go.d/nvidia_smi) ([#1310](https://github.com/netdata/go.d.plugin/pull/1310), [@ilyam8](https://github.com/ilyam8)) #### Other - Remove mem_private chart on FreeBSD (apps.plugin) ([#16166](https://github.com/netdata/netdata/pull/16166), [@ilyam8](https://github.com/ilyam8)) - Improve eBPF exit (ebpf.plugin) ([#16159](https://github.com/netdata/netdata/pull/16159), [@thiagoftsm](https://github.com/thiagoftsm)) - Update eBPF to use event loop for functions (ebpf.plugin) ([#16004](https://github.com/netdata/netdata/pull/16004), [@thiagoftsm](https://github.com/thiagoftsm)) - Add socket function (ebpf.plugin) ([#15850](https://github.com/netdata/netdata/pull/15850), [@thiagoftsm](https://github.com/thiagoftsm)) - Update restart time to 1 day (nfacct.plugin) ([#15801](https://github.com/netdata/netdata/pull/15801), [@ilyam8](https://github.com/ilyam8)) </details> ### Packaging / Installation <a id="v1430-contributions-packaging"></a> <details> <summary>All changes</summary> - Fix removing wrong directories when uninstalling on FreeBSD ([#16167](https://github.com/netdata/netdata/pull/16167), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix repo path for openSUSE 15.5 packages ([#16161](https://github.com/netdata/netdata/pull/16161), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix an issue running a Docker container when the default user was configured as a non-root user ([#16156](https://github.com/netdata/netdata/pull/16156), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where the uninstaller script doesn't clean up properly ([#16148](https://github.com/netdata/netdata/pull/16148), [@ilyam8](https://github.com/ilyam8)) - Fix problem with the uninstaller script when executed as a regular user 
([#16146](https://github.com/netdata/netdata/pull/16146), [@ilyam8](https://github.com/ilyam8)) - Skip trying to preserve file owners when bundling external code ([#15966](https://github.com/netdata/netdata/pull/15966), [@Ferroin](https://github.com/Ferroin)) - Cleanup Dockerfile ([#15902](https://github.com/netdata/netdata/pull/15902), [@Ferroin](https://github.com/Ferroin)) - Skip copying environment/install-type files when checking existing installations ([#15876](https://github.com/netdata/netdata/pull/15876), [@Ferroin](https://github.com/Ferroin)) - Add setuid fallback for perf and slabinfo plugins in the installer script ([#15807](https://github.com/netdata/netdata/pull/15807), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where cleanup was not performed during the kickstart.sh dry run ([#15775](https://github.com/netdata/netdata/pull/15775), [@ilyam8](https://github.com/ilyam8)) - Add CentOS-Stream to distros ([#15742](https://github.com/netdata/netdata/pull/15742), [@k0ste](https://github.com/k0ste)) - Fix build with --disable-https ([#15395](https://github.com/netdata/netdata/pull/15395), [@MrZammler](https://github.com/MrZammler)) - Enable building go.d plugin natively for CentOS-Stream ([#14551](https://github.com/netdata/netdata/pull/14551), [@k0ste](https://github.com/k0ste)) </details> ### Documentation <a id="v1430-contributions-documentation"></a> <details> <summary>All changes</summary> - Cleanup systemd-journal readme ([#16096](https://github.com/netdata/netdata/pull/16096), [@Ancairon](https://github.com/Ancairon)) - Change Twitter username to @netdatahq ([#16082](https://github.com/netdata/netdata/pull/16082), [@ralphm](https://github.com/ralphm)) - Add "maintained by netdata/community" badge to collectors ([#16073](https://github.com/netdata/netdata/pull/16073), [@Ancairon](https://github.com/Ancairon)) - Improve tc plugin description ([#16068](https://github.com/netdata/netdata/pull/16068), [@thiagoftsm](https://github.com/thiagoftsm)) 
- Add doc about running a local dashboard through Cloudflare ([#16043](https://github.com/netdata/netdata/pull/16043), [@Ancairon](https://github.com/Ancairon)) - Merge docs about Netdata Charts ([#16042](https://github.com/netdata/netdata/pull/16042), [@Ancairon](https://github.com/Ancairon)) - Add installation description for Homebrew on Apple Silicon ([#16027](https://github.com/netdata/netdata/pull/16027), [@theggs](https://github.com/theggs)) - Add a script to create COLLECTORS.md from the integrations.js file and generate the document ([#15995](https://github.com/netdata/netdata/pull/15995), [@Ancairon](https://github.com/Ancairon)) - Clarify shipping repositories cases ([#15960](https://github.com/netdata/netdata/pull/15960), [@tkatsoulas](https://github.com/tkatsoulas)) - Clarify possible installation types ([#15958](https://github.com/netdata/netdata/pull/15958), [@tkatsoulas](https://github.com/tkatsoulas)) - Add specific info on how to access the dashboards ([#15925](https://github.com/netdata/netdata/pull/15925), [@hugovalente-pm](https://github.com/hugovalente-pm)) - Cleanup "Change how long Netdata stores metrics" ([#15896](https://github.com/netdata/netdata/pull/15896), [@Ancairon](https://github.com/Ancairon)) - Update pfSense doc header ([#15894](https://github.com/netdata/netdata/pull/15894), [@Ancairon](https://github.com/Ancairon)) - Properly document issues with installing on IPv6-only hosts. 
([#15882](https://github.com/netdata/netdata/pull/15882), [@Ferroin](https://github.com/Ferroin)) - Add new `delete old models param` to ML readme ([#15873](https://github.com/netdata/netdata/pull/15873), [@andrewm4894](https://github.com/andrewm4894)) - Rename alarm to alert ([#15812](https://github.com/netdata/netdata/pull/15812), [@ilyam8](https://github.com/ilyam8)) - Fix typo in Readme ([#15794](https://github.com/netdata/netdata/pull/15794), [@shyamvalsan](https://github.com/shyamvalsan)) - Fix the wrong path in ansible-playbook deployment guide ([#15786](https://github.com/netdata/netdata/pull/15786), [@novotnyJiri](https://github.com/novotnyJiri)) - Fix docker-compose example ([#15784](https://github.com/netdata/netdata/pull/15784), [@zhqu1148980644](https://github.com/zhqu1148980644)) - Mark integrations milestones as completed in README.md ([#15783](https://github.com/netdata/netdata/pull/15783), [@tkatsoulas](https://github.com/tkatsoulas)) - Add dev documentation for Dynamic Configuration ([#15643](https://github.com/netdata/netdata/pull/15643), [@underhood](https://github.com/underhood)) </details> ### Health <a id="v1430-contributions-health"></a> <details> <summary>All changes</summary> - Add "summary" field to alert configurations ([#16129](https://github.com/netdata/netdata/pull/16129), [@MrZammler](https://github.com/MrZammler)) - Remove discontinued Hangouts and StackPulse notification methods ([#16041](https://github.com/netdata/netdata/pull/16041), [@Ancairon](https://github.com/Ancairon)) - Add go.d/upsd alerts ([#16036](https://github.com/netdata/netdata/pull/16036), [@ilyam8](https://github.com/ilyam8)) - Improve the accuracy of health queries ([#16032](https://github.com/netdata/netdata/pull/16032), [@MrZammler](https://github.com/MrZammler)) - Remove "family" field from alerts ([#16025](https://github.com/netdata/netdata/pull/16025), [@MrZammler](https://github.com/MrZammler)) - Add access control configuration to ntfy notification method 
([#15932](https://github.com/netdata/netdata/pull/15932), [@miversen33](https://github.com/miversen33)) - Improve ntfy notification title ([#15909](https://github.com/netdata/netdata/pull/15909), [@MAH69IK](https://github.com/MAH69IK)) - Add "summary" field to alerts ([#15886](https://github.com/netdata/netdata/pull/15886), [@MrZammler](https://github.com/MrZammler)) - Enable "ml_1min_node_ar" alert by default ([#14687](https://github.com/netdata/netdata/pull/14687), [@andrewm4894](https://github.com/andrewm4894)) </details> ### Other Notable Changes <a id="v1430-contributions-other"></a> <details> <summary>All changes</summary> #### Improvements - Improve ML database size management ([#16046](https://github.com/netdata/netdata/pull/16046), [@stelfrag](https://github.com/stelfrag)) - Add the ability for facets to generate histograms ([#15846](https://github.com/netdata/netdata/pull/15846), [@ktsaou](https://github.com/ktsaou)) - Add /api/v2/ilove.svg endpoint ([#15815](https://github.com/netdata/netdata/pull/15815), [@ktsaou](https://github.com/ktsaou)) - Reduce label memory usage ([#15255](https://github.com/netdata/netdata/pull/15255), [@stelfrag](https://github.com/stelfrag)) - Add severity level for logs ([#14727](https://github.com/netdata/netdata/pull/14727), [@thiagoftsm](https://github.com/thiagoftsm)) - Introduce molecule tests for the `netdata/ansible` playbook ([netdata/ansible#6](https://github.com/netdata/ansible/pull/6), [@luisj1983](https://github.com/luisj1983)) #### Bug Fixes - Fix MQTT crash when running 32-bit static build on 64-bit system on ARM ([#16154](https://github.com/netdata/netdata/pull/16154), [@underhood](https://github.com/underhood)) - Fix random crashes on pthread_detach() ([#16137](https://github.com/netdata/netdata/pull/16137), [@ktsaou](https://github.com/ktsaou)) - Fix crash on parsing clabel command with no source ([#16114](https://github.com/netdata/netdata/pull/16114), [@ilyam8](https://github.com/ilyam8)) - Fix crash on startup on 
busy parents ([#16016](https://github.com/netdata/netdata/pull/16016), [@ilyam8](https://github.com/ilyam8)) - Fix duplicate keys in labels ([#16014](https://github.com/netdata/netdata/pull/16014), [@stelfrag](https://github.com/stelfrag)) - Fix obsolete charts cleanup ([#15892](https://github.com/netdata/netdata/pull/15892), [@MrZammler](https://github.com/MrZammler)) #### Other - Fix corrupting the index when doubling the hashtable in facets ([#16171](https://github.com/netdata/netdata/pull/16171), [@ktsaou](https://github.com/ktsaou)) - Fix compilation warnings ([#16158](https://github.com/netdata/netdata/pull/16158), [@stelfrag](https://github.com/stelfrag)) - Don't queue removed when there is a newer alert ([#16157](https://github.com/netdata/netdata/pull/16157), [@MrZammler](https://github.com/MrZammler)) - Batch ML model load commands ([#16155](https://github.com/netdata/netdata/pull/16155), [@stelfrag](https://github.com/stelfrag)) - Make io charts "write" negative in apps and cgroups ([#16152](https://github.com/netdata/netdata/pull/16152), [@ilyam8](https://github.com/ilyam8)) - Various improvements in systemd-journal plugin facets ([#16150](https://github.com/netdata/netdata/pull/16150), [@ktsaou](https://github.com/ktsaou)) - Fix logging an unknown key error for "families" in health ([#16145](https://github.com/netdata/netdata/pull/16145), [@ilyam8](https://github.com/ilyam8)) - Fix journal help and mark debug keys in the output ([#16133](https://github.com/netdata/netdata/pull/16133), [@ktsaou](https://github.com/ktsaou)) - Fix an issue where newlines were removed when forwarding FUNCTION_PAYLOAD ([#16120](https://github.com/netdata/netdata/pull/16120), [@underhood](https://github.com/underhood)) - Fix an issue in systemd-journal plugin where anchors were not respected on non-data-only queries ([#16109](https://github.com/netdata/netdata/pull/16109), [@ktsaou](https://github.com/ktsaou)) - Improve systemd-journal plugin histogram and facets calculation 
([#16107](https://github.com/netdata/netdata/pull/16107), [@ktsaou](https://github.com/ktsaou)) - Various code improvements ([#16104](https://github.com/netdata/netdata/pull/16104), [@stelfrag](https://github.com/stelfrag)) - Improve systemd-journal plugin logging ([#16101](https://github.com/netdata/netdata/pull/16101), [@ktsaou](https://github.com/ktsaou)) - Improve systemd-journal plugin performance ([#16099](https://github.com/netdata/netdata/pull/16099), [@ktsaou](https://github.com/ktsaou)) - Fix incremental queries in systemd-journal plugin ([#16098](https://github.com/netdata/netdata/pull/16098), [@ktsaou](https://github.com/ktsaou)) - Fix querying out of retention ([#16094](https://github.com/netdata/netdata/pull/16094), [@ktsaou](https://github.com/ktsaou)) - Fix an issue where the buggy sd_journal_open_files_fd() was used on systems with old libsystemd ([#16090](https://github.com/netdata/netdata/pull/16090), [@ktsaou](https://github.com/ktsaou)) - Fix a busy wait loop in functions ([#16086](https://github.com/netdata/netdata/pull/16086), [@ktsaou](https://github.com/ktsaou)) - Skip database migration steps in new installation ([#16071](https://github.com/netdata/netdata/pull/16071), [@stelfrag](https://github.com/stelfrag)) - Fix coverity 402975 ([#16058](https://github.com/netdata/netdata/pull/16058), [@stelfrag](https://github.com/stelfrag)) - Send alerts summary field to cloud ([#16056](https://github.com/netdata/netdata/pull/16056), [@MrZammler](https://github.com/MrZammler)) - Fix summary field in table ([#16050](https://github.com/netdata/netdata/pull/16050), [@MrZammler](https://github.com/MrZammler)) - Fix overflow in storage engine stats on 32bit systems ([#16048](https://github.com/netdata/netdata/pull/16048), [@stelfrag](https://github.com/stelfrag)) - Fix wrong units in the `anomaly_detection.detector_events` chart ([#16028](https://github.com/netdata/netdata/pull/16028), [@andrewm4894](https://github.com/andrewm4894)) - Fix crash in 
systemd-journal plugin when the uid or gid have no names ([#16015](https://github.com/netdata/netdata/pull/16015), [@ktsaou](https://github.com/ktsaou)) - Remove the line length limit from pluginsd ([#16013](https://github.com/netdata/netdata/pull/16013), [@ktsaou](https://github.com/ktsaou)) - Fix compilation warnings ([#16006](https://github.com/netdata/netdata/pull/16006), [@stelfrag](https://github.com/stelfrag)) - Add missing files to CMakeLists.txt ([#16005](https://github.com/netdata/netdata/pull/16005), [@stelfrag](https://github.com/stelfrag)) - Fix compilation warnings ([#16001](https://github.com/netdata/netdata/pull/16001), [@ktsaou](https://github.com/ktsaou)) - Add collectors restart support to functions ([#15983](https://github.com/netdata/netdata/pull/15983), [@ktsaou](https://github.com/ktsaou)) - Maintain node's last connected timestamp in the db ([#15979](https://github.com/netdata/netdata/pull/15979), [@stelfrag](https://github.com/stelfrag)) - Fix race in apps plugin Processes function ([#15978](https://github.com/netdata/netdata/pull/15978), [@ktsaou](https://github.com/ktsaou)) - Implement functions cancelling ([#15977](https://github.com/netdata/netdata/pull/15977), [@ktsaou](https://github.com/ktsaou)) - Various improvements to facets ([#15976](https://github.com/netdata/netdata/pull/15976), [@ktsaou](https://github.com/ktsaou)) - Remove sending db retention in facets ([#15974](https://github.com/netdata/netdata/pull/15974), [@ktsaou](https://github.com/ktsaou)) - Extend ml default training from ~24 to ~48 hours ([#15971](https://github.com/netdata/netdata/pull/15971), [@andrewm4894](https://github.com/andrewm4894)) - Fix wrong facets when histogram is empty ([#15970](https://github.com/netdata/netdata/pull/15970), [@ktsaou](https://github.com/ktsaou)) - Fix shadowing local variable in facets ([#15968](https://github.com/netdata/netdata/pull/15968), [@ktsaou](https://github.com/ktsaou)) - Implement data-only queries in facets 
([#15961](https://github.com/netdata/netdata/pull/15961), [@ktsaou](https://github.com/ktsaou)) - Fix direction parsing in systemd-journal plugin ([#15957](https://github.com/netdata/netdata/pull/15957), [@ktsaou](https://github.com/ktsaou)) - Various improvements in facets ([#15956](https://github.com/netdata/netdata/pull/15956), [@ktsaou](https://github.com/ktsaou)) - Fix CID 400366 ([#15953](https://github.com/netdata/netdata/pull/15953), [@stelfrag](https://github.com/stelfrag)) - Improve streaming logs ([#15948](https://github.com/netdata/netdata/pull/15948), [@ktsaou](https://github.com/ktsaou)) - Improve facets performance ([#15940](https://github.com/netdata/netdata/pull/15940), [@ktsaou](https://github.com/ktsaou)) - Improve facets info ([#15936](https://github.com/netdata/netdata/pull/15936), [@ktsaou](https://github.com/ktsaou)) - Add info and source facets to systemd-journal plugin ([#15928](https://github.com/netdata/netdata/pull/15928), [@ktsaou](https://github.com/ktsaou)) - Various improvements in systemd-journal and facets ([#15926](https://github.com/netdata/netdata/pull/15926), [@ktsaou](https://github.com/ktsaou)) - Reduce workload during metadata cleanup ([#15919](https://github.com/netdata/netdata/pull/15919), [@stelfrag](https://github.com/stelfrag)) - Improve shutdown of the metadata thread ([#15901](https://github.com/netdata/netdata/pull/15901), [@stelfrag](https://github.com/stelfrag)) - Make `anomaly_detection.type_anomaly_rate` stacked ([#15895](https://github.com/netdata/netdata/pull/15895), [@andrewm4894](https://github.com/andrewm4894)) - Add better recovery for corrupted metadata ([#15891](https://github.com/netdata/netdata/pull/15891), [@stelfrag](https://github.com/stelfrag)) - Add index to ACLK table to improve update statements ([#15890](https://github.com/netdata/netdata/pull/15890), [@stelfrag](https://github.com/stelfrag)) - Limit atomic operations for statistics ([#15887](https://github.com/netdata/netdata/pull/15887), 
[@ktsaou](https://github.com/ktsaou)) - Allow any field to be a facet ([#15880](https://github.com/netdata/netdata/pull/15880), [@ktsaou](https://github.com/ktsaou)) - Improve hashing performance in facets by switching to the newer XXH3 128bits algorithm ([#15878](https://github.com/netdata/netdata/pull/15878), [@ktsaou](https://github.com/ktsaou)) - Update SQLITE version to 3.42.0 ([#15870](https://github.com/netdata/netdata/pull/15870), [@stelfrag](https://github.com/stelfrag)) - Add initialization fail reason to analytics ([#15866](https://github.com/netdata/netdata/pull/15866), [@stelfrag](https://github.com/stelfrag)) - Add a chart that groups anomaly rate by chart type ([#15856](https://github.com/netdata/netdata/pull/15856), [@vkalintiris](https://github.com/vkalintiris)) - Fix "unrecognized options: --with-zlib" configure warning ([#15840](https://github.com/netdata/netdata/pull/15840), [@stelfrag](https://github.com/stelfrag)) - Fix compilation warning ([#15839](https://github.com/netdata/netdata/pull/15839), [@stelfrag](https://github.com/stelfrag)) - Fix warning when compiling with -flto ([#15838](https://github.com/netdata/netdata/pull/15838), [@stelfrag](https://github.com/stelfrag)) - Fix opening journal files on systems with old systemd ([#15837](https://github.com/netdata/netdata/pull/15837), [@ktsaou](https://github.com/ktsaou)) - Add ilove.html ([#15836](https://github.com/netdata/netdata/pull/15836), [@ktsaou](https://github.com/ktsaou)) - Fix CID 382964: Code maintainability issues (SIZEOF_MISMATCH) ([#15833](https://github.com/netdata/netdata/pull/15833), [@stelfrag](https://github.com/stelfrag)) - Fix coverity 393052: API usage errors (LOCK) ([#15832](https://github.com/netdata/netdata/pull/15832), [@stelfrag](https://github.com/stelfrag)) - Fix an issue where fd wasn't released if setsockopt or bind fails ([#15826](https://github.com/netdata/netdata/pull/15826), [@stelfrag](https://github.com/stelfrag)) - Fix memory leak when updating job 
status if job does not exist in Dyncfg. ([#15822](https://github.com/netdata/netdata/pull/15822), [@stelfrag](https://github.com/stelfrag)) - Improve ML initialization ([#15819](https://github.com/netdata/netdata/pull/15819), [@stelfrag](https://github.com/stelfrag)) - Update cmakelist ([#15817](https://github.com/netdata/netdata/pull/15817), [@stelfrag](https://github.com/stelfrag)) - Add streaming support to Dyncfg ([#15791](https://github.com/netdata/netdata/pull/15791), [@underhood](https://github.com/underhood)) - Various fixes in Dyncfg ([#15785](https://github.com/netdata/netdata/pull/15785), [@underhood](https://github.com/underhood)) - Improve ML models cleanup ([#15720](https://github.com/netdata/netdata/pull/15720), [@vkalintiris](https://github.com/vkalintiris)) - Cleanup and refactoring of health and ACLK code ([#15665](https://github.com/netdata/netdata/pull/15665), [@stelfrag](https://github.com/stelfrag)) - Improve metadata cleanup ([#15462](https://github.com/netdata/netdata/pull/15462), [@stelfrag](https://github.com/stelfrag)) </details> ## Deprecation notice <a id="v1430-deprecation-notice"></a> ### Changed in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.42.0#v1420-deprecation-notice), the following items in this release have been changed: | Component | Type | Change | Action | |--------------------------------------------------------------------------------------------------------------|------------------------------------|--------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------:| | [apps.plugin](https://github.com/netdata/netdata/tree/master/collectors/apps.plugin) | collector | a dimension for each group/user/user group => a chart for each group/user/user group | | | 
[cgroups.plugin](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin) | collector | a dimension for each systemd service => a chart for each systemd service | | | [proc.plugin](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin) | collector | all "Networking Stack" metrics except "tcp" have been moved to "IPv4 Networking" | | | `family` attribute | alert configuration and Health API | deprecated | use [chart labels](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-chart-labels) | ### Will be changed in the next release We plan to change in the next release (v1.44.0): | Component | Type | Change | Action | |-----------------------------------------------------------------------------------------------------------------|-----------|----------------------------------------------------|:----------------------------------------------------------------------------------------:| | [charts.d/nut](https://github.com/netdata/netdata/tree/v1.42.4/collectors/charts.d.plugin/nut#upspdu-collector) | collector | deprecated | use [go.d/upsd](https://github.com/netdata/go.d.plugin/tree/master/modules/upsd#ups-nut) | ## Netdata Release Meetup <a id="v1430-netdata-release-meetup"></a> Join the Netdata team on the **18th of October at 16:30 UTC** for the [Netdata Release Meetup](https://www.meetup.com/netdata/events/296577240/). Together we’ll cover: - Release Highlights. - Acknowledgments. - Q&A with the community. [RSVP now](https://www.meetup.com/netdata/events/296577240/) - we look forward to meeting you. ## Support options <a id="v1430-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. 
Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-10-16T21:00:34+00:00 netdata v1.43.1 netdata v1.43.1 2023-10-26T14:17:28+00:00 Netdata v1.43.1 is a patch release to address issues discovered since [v1.43.0](https://github.com/netdata/netdata/releases/tag/v1.43.0). This patch release provides the following bug fixes and updates: - Prevent wrong optimization armv7l static build ([#16274](https://github.com/netdata/netdata/pull/16274), [@stelfrag](https://github.com/stelfrag)). - Fixed pattern matching in Functions Search ([#16264](https://github.com/netdata/netdata/pull/16264), [@ktsaou](https://github.com/ktsaou)). - Fixed an issue where the query planner was using the wrong dbengine tier that had no data for the selected time period ([#16263](https://github.com/netdata/netdata/pull/16263), [@ktsaou](https://github.com/ktsaou)). - Fixed invalid payload in Discord notifications ([#16257](https://github.com/netdata/netdata/pull/16257), [@luchaos](https://github.com/luchaos)). 
- Fixed possible deadlock on discovery thread shutdown in cgroups plugin ([#16246](https://github.com/netdata/netdata/pull/16246), [@stelfrag](https://github.com/stelfrag)). - Fixed duplicate chart labels ([#16249](https://github.com/netdata/netdata/pull/16249), [@stelfrag](https://github.com/stelfrag)). - Fixed dimension HETEROGENEOUS check ([#16234](https://github.com/netdata/netdata/pull/16234), [@stelfrag](https://github.com/stelfrag)). - Updated go.d plugin version to v0.56.3 ([#16228](https://github.com/netdata/netdata/pull/16228), [@ilyam8](https://github.com/ilyam8)). - Fixed calculation of dbengine statistics on 32bit systems ([#16222](https://github.com/netdata/netdata/pull/16222), [@stelfrag](https://github.com/stelfrag)). - Improved handling of duplicate labels ([#16172](https://github.com/netdata/netdata/pull/16172), [@stelfrag](https://github.com/stelfrag)). - Improved cleanup on shutdown of collectors ([#16023](https://github.com/netdata/netdata/pull/16023), [@ktsaou](https://github.com/ktsaou)) ## Acknowledgements <a id="v1431-acknowledgements"></a> We would like to thank our dedicated, talented contributors that make up this amazing community. The time and expertise that you volunteer are essential to our success. We thank you and look forward to continuing to grow together to build a remarkable product. - @luchaos for fixing Discord notifications. ## Support options <a id="v1431-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. 
- [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-10-26T14:17:28+00:00 netdata v1.43.2 netdata v1.43.2 2023-10-30T15:49:50+00:00 Netdata v1.43.2 is a patch release to address issues discovered since [v1.43.1](https://github.com/netdata/netdata/releases/tag/v1.43.1). This patch release provides the following bug fixes and updates: - Fix rrdlabels type ([1676de2](https://github.com/netdata/netdata/commit/1676de2413bc7bf4dfcf839e369ee660cd303703), [@stelfrag](https://github.com/stelfrag)). - Fix label copy to allow new keys with different values ([6179213](https://github.com/netdata/netdata/commit/61792132167d6f79aa7f876e66e7a28cac478fb3), [@stelfrag](https://github.com/stelfrag)). - Fix internal label source propagation when streaming metrics ([60cd86d](https://github.com/netdata/netdata/commit/60cd86dac50678805efead490aa4147bc11881b9), [@ktsaou](https://github.com/ktsaou)). - Speed up queries when sending alerts to Cloud on parents with a large number of alerts per child ([f80f0fc](https://github.com/netdata/netdata/commit/f80f0fc8ddaae0015b968d7492acaa7fda47ee4c), [@MrZammler](https://github.com/MrZammler)). - Fix filtering when selecting multiple fields in systemd-journal plugin ([750ca8e](https://github.com/netdata/netdata/commit/750ca8ef5393aee927cc59c2c910f45c7c14b703), [@stelfrag](https://github.com/stelfrag)). 
- Fix an issue where parents were missing chart labels of child instances ([240f9e7](https://github.com/netdata/netdata/commit/240f9e71512e9891b44aaf3fcd2db5e3185e7791), [@ktsaou](https://github.com/ktsaou)). - Fix an issue where updated labels were not propagated to parents ([644d432](https://github.com/netdata/netdata/commit/644d432867187a3dbdc055fbad8a06e038ffb520), [@stelfrag](https://github.com/stelfrag)). ## Support options <a id="v1432-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-10-30T15:49:50+00:00 netdata v1.44.0 netdata v1.44.0 2023-12-06T18:15:03+00:00 ## Table of Contents - [Netdata Growth ](#v1440-netdata-open-source-growth) - [Release Summary](#v1440-release-summary) - [Release Highlights](#v1440-release-highlights) - [Netdata vs. 
Prometheus](#v1440-showdown-netdata-prometheus) - [systemd-journal plugin improvements](#v1440-systemd-journal-logs) - [log2journal, a new tool in your quiver for log management](#v1440-util-log2journal) - [Netdata now logs to systemd-journal](#v1440-new-logging-mechanism) - [Functions, power up your troubleshooting toolkit!](#v1440-new-functions) - [New Alert Notification Integrations to Netdata Cloud](#v1440-new-alert-notification-integrations) - [Acknowledgments](#v1440-acknowledgments) - [Contributions](#v1440-contributions) - [Collectors](#v1440-contributions-collectors) - [Packaging/Installation](#v1440-contributions-packaging) - [Documentation ](#v1440-contributions-documentation) - [Other Notable Changes](#v1440-contributions-other) - [Deprecation notice](#v1440-deprecation-notice) - [Netdata Release Meetup](#v1440-netdata-release-meetup) - [Support options](#v1440-support-options) Steady to our schedule, this is another great Netdata release! > [!IMPORTANT] > Stay informed about upcoming changes and potential deprecations by reviewing the [deprecation notice](#v1440-deprecation-notice) sections. This will help you plan for any necessary adjustments to ensure a smooth transition. ## Netdata Growth <a id="v1440-netdata-open-source-growth"></a> - **66k+ GitHub Stars ⭐** Since October 2023, Netdata is leading the observability category in the [CNCF landscape](https://landscape.cncf.io/card-mode?category=observability-and-analysis&grouping=no&sort=stars), surpassing Elasticsearch. Thank you for your love ❤️! [Give Netdata a ⭐ too, on GitHub!](https://github.com/netdata/netdata) - **600M+ docker hub pulls** Netdata runs with about 200k docker hub downloads per day. Since June 2023 we are a [Verified Publisher](https://hub.docker.com/r/netdata/netdata), so that Netdata pulls don't count against docker hub pull limits for our users, allowing all our users to integrate Netdata to their CI/CD toolchains. 
## Release Summary <a id="v1440-release-summary"></a> - **Netdata beats Prometheus in all aspects**: this version of Netdata includes significant improvements that allow Netdata to be a lot more performant than Prometheus, at scale. Full performance analysis included. - **Netdata Journal Logs**: Netdata can now deal with huge systemd-journal databases and is available for the host logs when Netdata runs in a container. - **First beta version of Netdata's `log2journal`**: a utility to extract, convert, transform and send to systemd-journal any kind of structured logs (including JSON and logfmt logs), similar to what `promtail` does for Loki. - **More Netdata Functions**: monitor containers and VMs, network interfaces, mount points, block devices, systemd units, systemd services, and more! - **Netdata now logs to journal** instead of log files and the results are amazing! ## Release Highlights <a id="v1440-release-highlights"></a> ### Netdata beats Prometheus in all aspects <a id="v1440-showdown-netdata-prometheus"></a> ![image](https://github.com/netdata/netdata/assets/82235632/0d208974-74cf-4074-898f-b87146364d29) We tested Netdata and Prometheus at scale, both ingesting 2.7 million metrics per second. On the same workload, Netdata needs, compared to Prometheus: - 35% less CPU - 49% less RAM - 12% less bandwidth - 75% less disk space - 98% less disk I/O Read the [full performance comparison between Netdata and Prometheus](https://blog.netdata.cloud/netdata-vs-prometheus-performance-analysis/). To achieve these astonishing results, we made the following changes to Netdata since the previous release: #### New `SLOTS` streaming protocol A new streaming protocol allows Netdata children and parents to share a common index of the metrics streamed, so that parents can receive metrics without consulting hashtables. This reduces the overall overhead on parents by about 30%, without increasing the overhead on children (the children just number each metric).
The new protocol, called `SLOTS`, is automatically selected when both the child and the parent support it. #### Streaming compression algorithms Streaming now supports multiple compression algorithms. Previous Netdata releases supported only LZ4, which is known for its speed and average compression ratio. This release adds support for ZSTD, GZIP, and BROTLI. ZSTD provides the best balance between compression ratio and CPU consumption, and therefore it is now the default. The compression algorithms selection order can be configured on parents, in `stream.conf`, at the `[API]` section (parents), by setting `compression algorithms order = zstd lz4 brotli gzip`. If you need to save most bandwidth at the expense of CPU utilization set this so that `brotli` or `gzip` appear first in the list, before `zstd` and `lz4`. This also means that parents can now have a different compression order for each API key, allowing the use of different API keys based on the location of the child (i.e. children that are on billable egress bandwidth can use an API key that prefers the best compression, like `brotli` and `gzip`, while children on non-billable egress bandwidth can use an API key that prefers the best CPU utilization, like `zstd` or `lz4`). #### Gorilla compression beta <a id="v1440-release-highlights-gorilla"></a> Gorilla compression is a time series data compression technique, developed by Facebook for their time series database, Gorilla. It's particularly efficient for compressing data that changes incrementally over time, which is a common characteristic of time series data. This release of Netdata includes an adaptation of Gorilla compression, which once enabled, provides 30% additional memory reduction to Netdata. This was not ready when we compared Netdata and Prometheus, so the Gorilla compression benefits weren't accounted in the comparison. By enabling Gorilla compression, Netdata memory reduction is 70%+ compared to Prometheus. 
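To make the idea concrete, here is a minimal Python sketch of the value-compression principle commonly associated with Gorilla: each 64-bit float is XORed with the previous one, so repeated or slowly changing values produce results that are zero or have very few meaningful bits. This is an illustration only, not Netdata's adaptation; the function names `xor_stream`, `meaningful_bits`, and `decode` are hypothetical, and the sketch stops before the bit-packing step that would actually save space.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 double-precision bit pattern of x as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """Keep the first value raw; XOR every later value with its predecessor."""
    out, prev = [], 0
    for i, v in enumerate(values):
        bits = float_to_bits(v)
        out.append(bits if i == 0 else bits ^ prev)
        prev = bits
    return out


def meaningful_bits(x: int) -> int:
    """Number of bits between the first and last set bit (0 for a repeat)."""
    if x == 0:
        return 0
    trailing_zeros = (x & -x).bit_length() - 1
    return x.bit_length() - trailing_zeros


def decode(xored):
    """Reverse xor_stream and return the original floats."""
    values, prev = [], 0
    for i, x in enumerate(xored):
        bits = x if i == 0 else x ^ prev
        values.append(struct.unpack(">d", struct.pack(">Q", bits))[0])
        prev = bits
    return values


if __name__ == "__main__":
    samples = [42.0, 42.0, 42.0, 42.5, 42.5]
    for raw in xor_stream(samples)[1:]:
        print(meaningful_bits(raw))
```

A real encoder would store only the meaningful bits of each XOR result plus a short header, which is where the memory reduction comes from.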
To try Gorilla compression, edit `netdata.conf` and set at the `[db]` section, `dbengine page type = gorilla`. Keep in mind that enabling Gorilla compression changes the dbengine file format to Gorilla compressed metrics. This version of Netdata can read Gorilla-compressed data from dbengine even if Gorilla compression is not enabled, but previous versions of Netdata cannot read it. So, enable Gorilla only if you don't plan to switch back to a previous version of Netdata. Our plan is to have Gorilla compression enabled by default at the next release of Netdata. ### systemd-journal logs <a id="v1440-systemd-journal-logs"></a> Our `systemd-journal.plugin` was already about 10x faster than `journalctl`, but it was still slow when the journal database is huge (e.g. at journals centralization points where hundreds or thousands of nodes push their logs). In this release, we introduce several changes to allow the plugin to work promptly in such environments. #### Sampling and estimations The biggest performance issue with systemd-journal logs is the query performance when dealing with huge logs databases. To overcome this performance issue and provide prompt responses to queries, Netdata now uses the following strategy: 1. The latest 500k log entries read from journal files work like before: we read all of them and all the values for all their fields, so that we can have accurate histograms and counters per field value at the filters. 2. Once we hit the 500k log entries limit on a single query, we turn on **sampling and estimations**. 3. Sampling distributes 500k more log entries to all the journal files to be read, so that the total log entries queried for their field values will be 1M. This means that if we have to read 100 files, 10k log entries per file will be sampled and 10k log entries more will be unsampled. Since files are usually spread over time, this provides a good sample across time. 4.
When the sampling threshold is hit, Netdata continues reading more log entries without querying the values of the fields. These log entries appear as `[unsampled]` at the histogram. We know these log entries are there, but the value counters on the field filters do not include them. 5. When the `[unsampled]` threshold is hit, and we have read more than 1% of each file, Netdata estimates the number of entries that will be read from the file and skips the rest of it. This estimation appears as `[estimated]` in the histogram. The above process allows Netdata to provide a histogram of the logs in a timely manner, even when the number of log entries in the visible timeframe is several dozen million. A similar process is usually used by log management systems, including Grafana Loki and Elasticsearch. However, Netdata takes a much bigger sample of the data (other systems usually sample only a few thousand log entries, while Netdata usually samples more than a million) and the visualization allows exposing the exact sampling and estimations made at the histogram. Image showing `[unsampled]` and `[estimated]` on a systemd journal system that collects about 10k nginx log entries per second: ![image](https://github.com/netdata/netdata/assets/2662304/18da8997-db3c-4393-9c85-9f27251d8cc8) Read more about [journals query performance](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/README.md#performance-at-scale). #### journals scan On busy logs centralization servers, the number of journal files available in `/var/log/journal/remote` can grow significantly, slowing down directory listing (even `ls -l` is very slow on them). To overcome this issue, Netdata now uses inotify events and sorts the files to be scanned from the latest to the oldest. These changes allow Netdata to present the logs user interface for the most recent journals, immediately after a Netdata restart, while the journals database is scanned in the background. 
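The newest-to-oldest ordering described above can be sketched in a few lines of Python. This is only an illustration of the ordering idea, not the plugin's implementation: it does not model inotify, the helper name `newest_first` is hypothetical, and the `*.journal` file pattern is an assumption.

```python
from pathlib import Path


def newest_first(directory: str, pattern: str = "*.journal") -> list[Path]:
    """Return matching files under directory, most recently modified first.

    The "*.journal" pattern is an assumption made for this sketch.
    """
    files = [p for p in Path(directory).rglob(pattern) if p.is_file()]
    # Sort by modification time, descending, so the latest files come first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in newest_first("/var/log/journal"):
        print(path)
```

Processing files in this order means the most recent logs are available first, while older files can be handled later.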
#### Logs UI is now available when using Netdata docker images We switched Netdata docker images from Alpine Linux to Debian, so that `libsystemd` will be available inside the docker image, allowing `systemd-journal.plugin` to be compiled and shipped with Netdata docker images. Using Netdata docker images, Netdata can now query the host system journal files, while running inside the container. #### MESSAGE_ID support systemd-journal has a nice feature where certain events of common interest are given a specific `MESSAGE_ID`. Several such `MESSAGE_ID`s have been assigned to track common events, like coredumps, units start/stop events, VMs start/stop events, time changes, etc. In total, we found more than 50 unique events that are tracked this way. This version of `systemd-journal.plugin` automatically tracks and annotates these `MESSAGE_ID`s using their names, allowing quick spotting of events of common interest. This feature is available at the `MESSAGE_ID` field filter, at the right side of the dashboard. ### `log2journal`, a new tool in your quiver for managing logs <a id="v1440-util-log2journal"></a> `log2journal` is a new utility allowing the conversion of log files into structured systemd-journal log entries. This is currently in beta. The utility allows processing logs like this: ```bash tail -F /var/log/nginx/access.log |\ log2journal -c nginx-combined |\ systemd-cat-native ``` The above builds a basic pipeline for converting the `access.log` of an Nginx web server into structured log entries in the local systemd-journal. - `tail` is responsible for feeding the latest log lines to `log2journal`. Multiple files can be specified and `log2journal` can also pick up the filename from `tail` and add it as a field to the journal logs. - `log2journal` extracts fields from the log lines it is fed with. This is a powerful tool that can read `json` and `logfmt` logs, but also extract fields using PCRE2 patterns from any log.
It supports filtering, renaming, and rewriting rules using command line arguments or yaml configuration files. The output of `log2journal` is the standard [Journal Export Format](https://systemd.io/JOURNAL_EXPORT_FORMATS/). - `systemd-cat-native` is another new Netdata utility, reading standard Journal Export Format entries, which are then sent to a local or remote systemd-journal system. [Read more here](https://github.com/netdata/netdata/blob/master/collectors/log2journal/README.md). Image showing structured nginx logs into systemd-journal: ![image](https://github.com/netdata/netdata/assets/2662304/16b471ff-c5a1-4fcc-bcd5-83551e089f6c) ### Netdata now logs to systemd-journal <a id="v1440-new-logging-mechanism"></a> The logging layer of Netdata has been rewritten, so that Netdata logs now go to the systemd-journal, in a namespace called `netdata`. The obvious outcome is that now you can monitor Netdata logs, using Netdata's `systemd-journal.plugin` user interface and thanks to journal namespaces, this does not pollute the system logs. But this is just the beginning... Netdata utilizes the `MESSAGE_ID` feature of systemd-journal to register: - all alert transitions - all alert notifications - all connections from Netdata children - all connections to Netdata parents This means that the `systemd-journal.plugin` user interface, and `journalctl` can now be used to list all such events uniformly. Screenshot of Netdata alert transitions in systemd-journals: ![image](https://github.com/netdata/netdata/assets/2662304/a46cfd5f-3e95-4904-b554-e1f662ca37f7) All Netdata logs are now structured. Netdata can also log in `json` or `logfmt` formats. We introduced a lot of new fields to track every aspect of Netdata, in a uniform and consistent way. [Read more here](https://github.com/netdata/netdata/tree/master/libnetdata/log). Furthermore, we introduced a new tool called `systemd-cat-native` allowing any application or shell script to send structured logs to systemd-journal. 
[Read more here](). ### Functions, power up your troubleshooting toolkit! <a id="v1440-new-functions"></a> Several new **Functions** have been added to help us in our troubleshooting journeys. On top of `processes`, `streaming` and `systemd-journal`, we are leveraging the wide range of collectors and metrics Netdata has and bring data in a different visual representation. The updated list can be found on our documentation [here](https://learn.netdata.cloud/docs/visualizations/functions#what-functions-are-currently-available), and you can find a summary of the currently available functions with the corresponding CLI tool it relates to: | Function | Description | Alternative to CLI tools | plugin - module | |:-------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------|:-----------------------------------------------------------------------------------------------------------| | block-devices | Disk I/O activity for all block devices, offering insights into both data transfer volume and operation performance. | `iostat` | [proc](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin#readme) | | containers-vms | Insights into the resource utilization of containers and QEMU virtual machines: CPU usage, memory consumption, disk I/O, and network traffic. | `docker stats`, `systemd-cgtop` | [cgroups](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin#readme) | | ipmi-sensors | Readings and status of IPMI sensors. | `ipmi-sensors` | [freeipmi](https://github.com/netdata/netdata/tree/master/collectors/freeipmi.plugin#readme) | | mount-points | Disk usage for each mount point, including used and available space, both in terms of percentage and actual bytes, as well as used and available inode counts. 
| `df` | [diskspace](https://github.com/netdata/netdata/tree/master/collectors/diskspace.plugin#readme) | | network-interfaces | Network traffic, packet drop rates, interface states, MTU, speed, and duplex mode for all network interfaces. | `bmon`, `bwm-ng` | [proc](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin#readme) | | processes | Real-time information about the system's resource usage, including CPU utilization, memory consumption, and disk IO for every running process. | `top`, `htop` | [apps](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md) | | systemd-journal | Viewing, exploring and analyzing systemd journal logs. | `journalctl` | [systemd-journal](https://github.com/netdata/netdata/tree/master/collectors/systemd-journal.plugin#readme) | | systemd-list-units | Information about all systemd units, including their active state, description, whether or not they are enabled, and more. | `systemctl list-units` | [systemd-journal](https://github.com/netdata/netdata/tree/master/collectors/systemd-journal.plugin#readme) | | systemd-services | System resource utilization for all running systemd services: CPU, memory, and disk IO. | `systemd-cgtop` | [cgroups](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin#readme) | | streaming | Comprehensive overview of all Netdata children instances, offering detailed information about their status, replication completion time, and much more. | | | In the short term, we will keep adding more (hopefully) helpful Functions, but we also have a longer-term plan to expand this functionality to potentially allow taking and storing snapshots of the results based on triggered alerts or a periodic configuration. In case you have suggestions, we have a running GitHub Discussion open [here](https://github.com/netdata/netdata/discussions/16438).
### New Alert Notification Integrations to Netdata Cloud <a id="v1440-new-alert-notification-integrations"></a> We've been working on adding more Alert Notification Integrations to Netdata Cloud and recently added the following new ones: * Amazon Simple Notification Service (Amazon SNS), and * Telegram ![image](https://github.com/netdata/netdata/assets/82235632/92bb724a-b64d-4eb9-8570-49a5a5fdee7a) The full list of Alert Notification Integrations from Netdata Cloud can be found on our documentation [here](https://learn.netdata.cloud/docs/alerting/notifications/centralized-cloud-notifications). ## Acknowledgments <a id="v1440-acknowledgments"></a> - @ClaraCrazy for improving degraded adapters detection in python.d/megacli. - @thomasbeaudry for adding UPS selftest and status metrics to charts.d/apcupsd. - @watsonbox for adding LBAs written/read metrics to python.d/smartd_log. - @sepek for correcting an error in the "Change how long Netdata stores metrics" guide. - @seniorquico for fixing parsing and adding MAINT status metrics to python.d/haproxy. - @luisj1983 for correcting errors in the Health API documentation. - @andyundso for improving apps plugin by adding Erlang in apps_groups.conf. - @vobruba-martin for adding various improvements to go.d/mysql. 
## Contributions <a id="v1440-contributions"></a> ### Collectors <a id="v1440-contributions-collectors"></a> <details open> <summary><b>Improvements</b></summary> - Add more cases for megacli adapter degraded state (python.d/megacli) ([#16522](https://github.com/netdata/netdata/pull/16522), [@ClaraCrazy](https://github.com/ClaraCrazy)) - Improve estimations accuracy (systemd-journal.plugin) ([#16467](https://github.com/netdata/netdata/pull/16467), [@ktsaou](https://github.com/ktsaou)) - Implement estimations (systemd-journal.plugin)([#16445](https://github.com/netdata/netdata/pull/16445), [@ktsaou](https://github.com/ktsaou)) - Improve startup time (systemd-journal.plugin) ([#16443](https://github.com/netdata/netdata/pull/16443), [@ktsaou](https://github.com/ktsaou)) - Implement sampling (systemd-journal.plugin) ([#16433](https://github.com/netdata/netdata/pull/16433), [@ktsaou](https://github.com/ktsaou)) - Add cgroup current pids metric (cgroups.plugin) ([#16369](https://github.com/netdata/netdata/pull/16369), [@ilyam8](https://github.com/ilyam8)) - Add Ipmi-sensors function (freeipmi.plugin) ([#16363](https://github.com/netdata/netdata/pull/16363), [@ilyam8](https://github.com/ilyam8)) - Add UPS status code metric (charts.d/apcupsd) ([#16361](https://github.com/netdata/netdata/pull/16361), [@thomasbeaudry](https://github.com/thomasbeaudry)) - Add Mount-points function (diskspace.plugin) ([#16345](https://github.com/netdata/netdata/pull/16345), [@ilyam8](https://github.com/ilyam8)) - Add Block-devices function (proc/diskstats) ([#16338](https://github.com/netdata/netdata/pull/16338), [@ilyam8](https://github.com/ilyam8)) - Add UsedBy field to Network-interfaces function (proc/proc_net_dev) ([#16337](https://github.com/netdata/netdata/pull/16337), [@ilyam8](https://github.com/ilyam8)) - Add various improvements to Network-interfaces function (proc/proc_net_dev)([#16336](https://github.com/netdata/netdata/pull/16336), [@ilyam8](https://github.com/ilyam8)) - Add 
Network-interfaces function (proc/proc_net_dev) ([#16334](https://github.com/netdata/netdata/pull/16334), [@ilyam8](https://github.com/ilyam8)) - Add Systemd-list-units function (systemd-journal.plugin) ([#16318](https://github.com/netdata/netdata/pull/16318), [@ktsaou](https://github.com/ktsaou)) - Add Containers-vms function (cgroups.plugin) ([#16314](https://github.com/netdata/netdata/pull/16314), [@ktsaou](https://github.com/ktsaou)) - Add UPS selftest status metric (charts.d/apcupsd) ([#16286](https://github.com/netdata/netdata/pull/16286), [@thomasbeaudry](https://github.com/thomasbeaudry)) - Add a configuration option to set private cleanup timeout (statsd.plugin) ([#16269](https://github.com/netdata/netdata/pull/16269), [@MrZammler](https://github.com/MrZammler)) - Add container_device label to network interfaces (cgroups.plugin) ([#16261](https://github.com/netdata/netdata/pull/16261), [@ilyam8](https://github.com/ilyam8)) - Add selecting multiple sources support (systemd-journal.plugin) ([#16252](https://github.com/netdata/netdata/pull/16252), [@ktsaou](https://github.com/ktsaou)) - Add total LBAs written/read metrics (python.d/smartd_log) ([#16245](https://github.com/netdata/netdata/pull/16245), [@watsonbox](https://github.com/watsonbox)) - Add Erlang to apps_groups.conf (apps.plugin) ([#16231](https://github.com/netdata/netdata/pull/16231), [@andyundso](https://github.com/andyundso)) - Add support for Proxmox vms/containers name resolution in Docker (cgroups.plugin) ([#16193](https://github.com/netdata/netdata/pull/16193), [@ilyam8](https://github.com/ilyam8)) - Add nested JSON support to log parser (go.d/weblog) ([#1416](https://github.com/netdata/go.d.plugin/pull/1416), [@ilyam8](https://github.com/ilyam8)) </details> <details> <summary><b>Bug fixes</b></summary> #### Bug Fixes - Fix configuration loading (charts.d.plugin ) ([#16471](https://github.com/netdata/netdata/pull/16471), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where 
systemd-journal would stop trying different socket paths after the first failure (systemd-journal.plugin) ([#16458](https://github.com/netdata/netdata/pull/16458), [@ktsaou](https://github.com/ktsaou)) - Fix parsing PD without NCQ status (python.d/adaptec_raid) ([#16400](https://github.com/netdata/netdata/pull/16400), [@ilyam8](https://github.com/ilyam8)) - Fix Systemd-list-units function expiration time ([#16393](https://github.com/netdata/netdata/pull/16393), [@ilyam8](https://github.com/ilyam8)) - Fix lack of system.net when running inside LXC ([#16364](https://github.com/netdata/netdata/pull/16364), [@ilyam8](https://github.com/ilyam8)) - Fix memory leak in Systemd-list-units function (systemd-journal.plugin) ([#16333](https://github.com/netdata/netdata/pull/16333), [@ktsaou](https://github.com/ktsaou)) - Fix server status parsing and add MAINT status chart (python.d/haproxy) ([#16253](https://github.com/netdata/netdata/pull/16253), [@seniorquico](https://github.com/seniorquico)) </details> <details> <summary><b>Other</b></summary> #### Other - Skip timestamp when logging to journald (python.d.plugin) ([#16516](https://github.com/netdata/netdata/pull/16516), [@ilyam8](https://github.com/ilyam8)) - Mute stock jobs logging during check() (python.d.plugin) ([#16515](https://github.com/netdata/netdata/pull/16515), [@ilyam8](https://github.com/ilyam8)) - Improvement performance of the plugin (systemd-journal.plugin) ([#16509](https://github.com/netdata/netdata/pull/16509), [@ktsaou](https://github.com/ktsaou)) - Don't create runtime disk config by default (proc/diskspace, proc/diskstats) ([#16503](https://github.com/netdata/netdata/pull/16503), [@ilyam8](https://github.com/ilyam8)) - Don't create runtime device config by default (proc/proc_net_dev) ([#16501](https://github.com/netdata/netdata/pull/16501), [@ilyam8](https://github.com/ilyam8)) - Disable netdata monitoring section by default ([#16480](https://github.com/netdata/netdata/pull/16480), 
[@MrZammler](https://github.com/MrZammler)) - Change apps oom and net charts order (ebpf.plugin) ([#16395](https://github.com/netdata/netdata/pull/16395), [@thiagoftsm](https://github.com/thiagoftsm)) - Fix "differ in signedness" warn in cgroups plugin ([#16391](https://github.com/netdata/netdata/pull/16391), [@ilyam8](https://github.com/ilyam8)) - Fix throttle_duration chart context (cgroups.plugin) ([#16367](https://github.com/netdata/netdata/pull/16367), [@ilyam8](https://github.com/ilyam8)) - Hide summary columns in network and block devices functions (proc/diskstats, proc/proc_net_dev) ([#16347](https://github.com/netdata/netdata/pull/16347), [@ktsaou](https://github.com/ktsaou)) - Fix crash when a container has no CPU/mem metrics in Containers-vms function (cgroups.plugin) ([#16331](https://github.com/netdata/netdata/pull/16331), [@ilyam8](https://github.com/ilyam8)) - Add tcp v6 connect calls to Ebpf_socket function (ebpf.plugin) ([#16316](https://github.com/netdata/netdata/pull/16316), [@thiagoftsm](https://github.com/thiagoftsm)) - Update journal sources once per minute (systemd-journal.plugin) ([#16298](https://github.com/netdata/netdata/pull/16298), [@ktsaou](https://github.com/ktsaou)) - Minor updates and cleanup (systemd-journal.plugin) ([#16267](https://github.com/netdata/netdata/pull/16267), [@ktsaou](https://github.com/ktsaou)) - Stop using deprecated distutils module (python.d.plugin) ([#16259](https://github.com/netdata/netdata/pull/16259), [@MrZammler](https://github.com/MrZammler)) - Remove charts.d/nut ([#16230](https://github.com/netdata/netdata/pull/16230), [@ilyam8](https://github.com/ilyam8)) - Don't log an error opening `cgroup.procs`/`tasks` if it does not exist (cgroups.plugin) ([#16196](https://github.com/netdata/netdata/pull/16196), [@ilyam8](https://github.com/ilyam8)) - Improve exposing metrics by creating a chart for each app group (ebpf.plugin) ([#16139](https://github.com/netdata/netdata/pull/16139), 
[@thiagoftsm](https://github.com/thiagoftsm)) - Skip timestamp when logging to journald (go.d.plugin) ([#1418](https://github.com/netdata/go.d.plugin/pull/1418), [@ilyam8](https://github.com/ilyam8)) - Replace logger with structured logger (go.d.plugin) ([#1418](https://github.com/netdata/go.d.plugin/pull/1418), [@ilyam8](https://github.com/ilyam8)) - Use SHOW REPLICA STATUS for MySQL v8.0.22+ (go.d/mysql) ([#1392](https://github.com/netdata/go.d.plugin/pull/1392), [@vobruba-martin](https://github.com/vobruba-martin)) - Use performance_schema instead of information_schema for MySQL v8.0.22+ (go.d/mysql) ([#1390](https://github.com/netdata/go.d.plugin/pull/1390), [@vobruba-martin](https://github.com/vobruba-martin)) </details> ### Packaging/Installation <a id="v1440-contributions-packaging"></a> <details open> <summary><b>All changes</b></summary> - Add curl example to create_netdata_conf() ([#16498](https://github.com/netdata/netdata/pull/16498), [@ilyam8](https://github.com/ilyam8)) - Fix incorrect DEB package build dependencies ([#16483](https://github.com/netdata/netdata/pull/16483), [@Ferroin](https://github.com/Ferroin)) - Update go.d plugin version to v0.57.1 ([#16465](https://github.com/netdata/netdata/pull/16465), [@ilyam8](https://github.com/ilyam8)) - Add sbindir_POST to the PATH of bash scripts using `systemd-cat-native` ([#16456](https://github.com/netdata/netdata/pull/16456), [@ilyam8](https://github.com/ilyam8)) - Add LogNamespace to Netdata systemd unit ([#16454](https://github.com/netdata/netdata/pull/16454), [@ilyam8](https://github.com/ilyam8)) - Add linking daemon.log to stderr in docker ([#16447](https://github.com/netdata/netdata/pull/16447), [@ilyam8](https://github.com/ilyam8)) - Add support for installing a specific major version of the agent on install ([#16413](https://github.com/netdata/netdata/pull/16413), [@Ferroin](https://github.com/Ferroin)) - Improve handling around EPEL requirement for RPM packages 
([#16406](https://github.com/netdata/netdata/pull/16406), [@Ferroin](https://github.com/Ferroin)) - Add zstd-dev to install-required-packages ([#16370](https://github.com/netdata/netdata/pull/16370), [@ilyam8](https://github.com/ilyam8)) - Fix zstd in static build ([#16349](https://github.com/netdata/netdata/pull/16349), [@ilyam8](https://github.com/ilyam8)) - Cleanup systemd unit files After ([#16332](https://github.com/netdata/netdata/pull/16332), [@ilyam8](https://github.com/ilyam8)) - Update openssl to 3.1.4 for static build ([#16303](https://github.com/netdata/netdata/pull/16303), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix not removing /etc/cron.d/netdata-updater-daily during uninstall ([#16233](https://github.com/netdata/netdata/pull/16233), [@ilyam8](https://github.com/ilyam8)) - Rename auto-update-method to auto-update-type in kickstart ([#16229](https://github.com/netdata/netdata/pull/16229), [@ilyam8](https://github.com/ilyam8)) - Remove support for Alpine 3.15 ([#16205](https://github.com/netdata/netdata/pull/16205), [@tkatsoulas](https://github.com/tkatsoulas)) - Switch to using Debian as a base for our Docker images. 
([#15823](https://github.com/netdata/netdata/pull/15823), [@Ferroin](https://github.com/Ferroin)) </details> ### Documentation <a id="v1440-contributions-documentation"></a> <details> <summary><b>All changes</b></summary> - Add with-systemd-units-monitoring example to docker ([#16513](https://github.com/netdata/netdata/pull/16513), [@ilyam8](https://github.com/ilyam8)) - Add /var/log mount to docker install guide ([#16496](https://github.com/netdata/netdata/pull/16496), [@ilyam8](https://github.com/ilyam8)) - Fix spelling in Cloud on-prem documentation ([#16490](https://github.com/netdata/netdata/pull/16490), [@M4itee](https://github.com/M4itee)) - Fix disable or silence alerts examples in Health API doc ([#16446](https://github.com/netdata/netdata/pull/16446), [@luisj1983](https://github.com/luisj1983)) - Fix icon filename in freebsd meta ([#16441](https://github.com/netdata/netdata/pull/16441), [@shyamvalsan](https://github.com/shyamvalsan)) - Add Cloud on-prem documentation ([#16440](https://github.com/netdata/netdata/pull/16440), [@M4itee](https://github.com/M4itee)) - Fix typos in Health reference doc ([#16439](https://github.com/netdata/netdata/pull/16439), [@MrZammler](https://github.com/MrZammler)) - Add Cloud Telegram notification metadata ([#16424](https://github.com/netdata/netdata/pull/16424), [@juacker](https://github.com/juacker)) - Fix chart labels examples in Health reference doc ([#16423](https://github.com/netdata/netdata/pull/16423), [@MrZammler](https://github.com/MrZammler)) - Improve Netdata Functions documentation ([#16421](https://github.com/netdata/netdata/pull/16421), [@shyamvalsan](https://github.com/shyamvalsan)) - Add /etc/localtime mount to docker install guide ([#16392](https://github.com/netdata/netdata/pull/16392), [@ilyam8](https://github.com/ilyam8)) - Remove 'families' from Health reference ([#16380](https://github.com/netdata/netdata/pull/16380), [@ilyam8](https://github.com/ilyam8)) - Fix indentation in Cloud AWS SNS
notification meta ([#16379](https://github.com/netdata/netdata/pull/16379), [@ilyam8](https://github.com/ilyam8)) - Remove alert guides that are not in the repository ([#16375](https://github.com/netdata/netdata/pull/16375), [@ilyam8](https://github.com/ilyam8)) - Remove unused cloud notification methods docs ([#16372](https://github.com/netdata/netdata/pull/16372), [@ilyam8](https://github.com/ilyam8)) - Add configuration documentation for Cloud AWS SNS ([#16371](https://github.com/netdata/netdata/pull/16371), [@car12o](https://github.com/car12o)) - Correct time unit for dbengine tier 2 explanation ([#16368](https://github.com/netdata/netdata/pull/16368), [@sepek](https://github.com/sepek)) - Add assorted improvements to the version policy draft ([#16362](https://github.com/netdata/netdata/pull/16362), [@Ferroin](https://github.com/Ferroin)) - Import alert guides from Netdata Assistant ([#16355](https://github.com/netdata/netdata/pull/16355), [@ralphm](https://github.com/ralphm)) - Update packaging instructions ([#16344](https://github.com/netdata/netdata/pull/16344), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix readme images ([#16327](https://github.com/netdata/netdata/pull/16327), [@Ancairon](https://github.com/Ancairon)) - Fix nightly tag in helm deploy meta ([#16326](https://github.com/netdata/netdata/pull/16326), [@ilyam8](https://github.com/ilyam8)) - Added section Blog posts to readme ([#16323](https://github.com/netdata/netdata/pull/16323), [@Aliki92](https://github.com/Aliki92)) - Add a note for the docker deployment alongside with cetus ([#16312](https://github.com/netdata/netdata/pull/16312), [@tkatsoulas](https://github.com/tkatsoulas)) - Fix missing privileges and mounts in Docker Swarm deploy doc ([#16308](https://github.com/netdata/netdata/pull/16308), [@ilyam8](https://github.com/ilyam8)) - Use proper icons for deploy integrations ([#16305](https://github.com/netdata/netdata/pull/16305), [@Ancairon](https://github.com/Ancairon)) - Change
True/False to yes/no in collectors docs ([#16289](https://github.com/netdata/netdata/pull/16289), [@ilyam8](https://github.com/ilyam8)) - Fix 404s in markdown files ([#16285](https://github.com/netdata/netdata/pull/16285), [@Ancairon](https://github.com/Ancairon)) - Add Forward Secure Sealing in Systemd-Journal doc ([#16247](https://github.com/netdata/netdata/pull/16247), [@ktsaou](https://github.com/ktsaou)) - Fix apps plugin metric names in meta ([#16243](https://github.com/netdata/netdata/pull/16243), [@ilyam8](https://github.com/ilyam8)) - Add active/passive journald centralization without encryption guide ([#16236](https://github.com/netdata/netdata/pull/16236), [@tkatsoulas](https://github.com/tkatsoulas)) - Add document outlining our versioning policy and public API ([#16227](https://github.com/netdata/netdata/pull/16227), [@Ferroin](https://github.com/Ferroin)) - Cleanup journald centralization guide ([#16225](https://github.com/netdata/netdata/pull/16225), [@Ancairon](https://github.com/Ancairon)) - Update info about custom dashboards ([#16121](https://github.com/netdata/netdata/pull/16121), [@elizabyte8](https://github.com/elizabyte8)) - Add info to native packages docs about mirroring our repos ([#16069](https://github.com/netdata/netdata/pull/16069), [@Ferroin](https://github.com/Ferroin)) - Add getting started with netdata Cloud on-prem doc ([#15954](https://github.com/netdata/netdata/pull/15954), [@M4itee](https://github.com/M4itee)) </details> ### Other Notable Changes <a id="v1440-contributions-other"></a> <details open> <summary><b>Improvements</b></summary> - Implement removal of unregistered ephemeral host chart labels ([#16486](https://github.com/netdata/netdata/pull/16486), [@stelfrag](https://github.com/stelfrag)) - Implement ephemeral hosts cleanup ([#16381](https://github.com/netdata/netdata/pull/16381), [@stelfrag](https://github.com/stelfrag)) - Implement structured logging and log to Systemd journal by default
([#16357](https://github.com/netdata/netdata/pull/16357), [@ktsaou](https://github.com/ktsaou)) - Made Streaming function available to all users ([#16346](https://github.com/netdata/netdata/pull/16346), [@ktsaou](https://github.com/ktsaou)) - Improve database corruption detection during runtime ([#16343](https://github.com/netdata/netdata/pull/16343), [@stelfrag](https://github.com/stelfrag)) - Add Brotli support for streaming ([#16287](https://github.com/netdata/netdata/pull/16287), [@ktsaou](https://github.com/ktsaou)) - Add ZSTD compression support for streaming ([#16268](https://github.com/netdata/netdata/pull/16268), [@ktsaou](https://github.com/ktsaou)) - Add script to generate self-signed-certificates for systemd-journal-remote ([#16235](https://github.com/netdata/netdata/pull/16235), [@ktsaou](https://github.com/ktsaou)) - Optimizations to make busy parents more efficient ([#16127](https://github.com/netdata/netdata/pull/16127), [@ktsaou](https://github.com/ktsaou)) - Add support for gorilla pages for tier 0 ([#15969](https://github.com/netdata/netdata/pull/15969), [@vkalintiris](https://github.com/vkalintiris)) </details> <details> <summary><b>Bug Fixes</b></summary> - Fix occasional shutdown deadlock ([#16495](https://github.com/netdata/netdata/pull/16495), [@stelfrag](https://github.com/stelfrag)) - Fix reusing TCP port when binding ([#16420](https://github.com/netdata/netdata/pull/16420), [@ilyam8](https://github.com/ilyam8)) - Fix counting spaces as CPU cores when reading `cpuset.cpus` ([#16385](https://github.com/netdata/netdata/pull/16385), [@ilyam8](https://github.com/ilyam8)) - Improve agent to cloud status update process ([#16342](https://github.com/netdata/netdata/pull/16342), [@stelfrag](https://github.com/stelfrag)) </details> <details> <summary><b>Other</b></summary> - Improve logging clarity: change log level to debug for dbengine routine operations on start ([#16518](https://github.com/netdata/netdata/pull/16518), 
[@ilyam8](https://github.com/ilyam8)) - Improve logging clarity: remove logging of system info ([#16517](https://github.com/netdata/netdata/pull/16517), [@ilyam8](https://github.com/ilyam8)) - Add prefix to logs-management chart names ([#16514](https://github.com/netdata/netdata/pull/16514), [@Dim-P](https://github.com/Dim-P)) - Fix "has aggregated" debug output in apps.plugin ([#16512](https://github.com/netdata/netdata/pull/16512), [@ilyam8](https://github.com/ilyam8)) - Various improvements to log2journal part 4 ([#16510](https://github.com/netdata/netdata/pull/16510), [@ktsaou](https://github.com/ktsaou)) - Improve logging clarity: reclassify specific messages as information part2 ([#16508](https://github.com/netdata/netdata/pull/16508), [@ilyam8](https://github.com/ilyam8)) - Fix coverity issue 410232 ([#16507](https://github.com/netdata/netdata/pull/16507), [@stelfrag](https://github.com/stelfrag)) - Improve logging clarity: reclassify specific messages as information part1 ([#16505](https://github.com/netdata/netdata/pull/16505), [@ilyam8](https://github.com/ilyam8)) - Fix CID 410152 dereference after null check ([#16502](https://github.com/netdata/netdata/pull/16502), [@stelfrag](https://github.com/stelfrag)) - Various improvements to log2journal part 2 ([#16494](https://github.com/netdata/netdata/pull/16494), [@ktsaou](https://github.com/ktsaou)) - Fix builds on macOS due to missing endianness functions ([#16489](https://github.com/netdata/netdata/pull/16489), [@vkalintiris](https://github.com/vkalintiris)) - Add missing yaml elements to log2journal ([#16488](https://github.com/netdata/netdata/pull/16488), [@ktsaou](https://github.com/ktsaou)) - Add option to submit logs to systemd journal to logs-management ([#16485](https://github.com/netdata/netdata/pull/16485), [@Dim-P](https://github.com/Dim-P)) - Add function cancellability to logs-management ([#16484](https://github.com/netdata/netdata/pull/16484), [@Dim-P](https://github.com/Dim-P)) - Move 
log2journal to collectors/ ([#16481](https://github.com/netdata/netdata/pull/16481), [@ktsaou](https://github.com/ktsaou)) - Add YAML configuration support to log2journal ([#16479](https://github.com/netdata/netdata/pull/16479), [@ktsaou](https://github.com/ktsaou)) - Log alarm notifications to health.log ([#16476](https://github.com/netdata/netdata/pull/16476), [@ktsaou](https://github.com/ktsaou)) - Cleanup systemd-journald plugin code ([#16475](https://github.com/netdata/netdata/pull/16475), [@ktsaou](https://github.com/ktsaou)) - Check context post processing queue before sending status to Cloud ([#16472](https://github.com/netdata/netdata/pull/16472), [@stelfrag](https://github.com/stelfrag)) - Fix error limit to respect the log every ([#16469](https://github.com/netdata/netdata/pull/16469), [@stelfrag](https://github.com/stelfrag)) - Fix analytics logs ([#16462](https://github.com/netdata/netdata/pull/16462), [@ktsaou](https://github.com/ktsaou)) - Fix logs bashism ([#16461](https://github.com/netdata/netdata/pull/16461), [@ktsaou](https://github.com/ktsaou)) - Fix log2journal incorrect log ([#16460](https://github.com/netdata/netdata/pull/16460), [@ktsaou](https://github.com/ktsaou)) - Fix various logging environment variables issues ([#16459](https://github.com/netdata/netdata/pull/16459), [@ktsaou](https://github.com/ktsaou)) - Add environment variable for the Journald socket path ([#16457](https://github.com/netdata/netdata/pull/16457), [@ktsaou](https://github.com/ktsaou)) - Fix missing argument on log init in xenstat plugin ([#16451](https://github.com/netdata/netdata/pull/16451), [@vkalintiris](https://github.com/vkalintiris)) - Change log flood protection to 1000 log lines / 1 minute ([#16450](https://github.com/netdata/netdata/pull/16450), [@ilyam8](https://github.com/ilyam8)) - Change the interface_speed alarm value to Mbit ([#16429](https://github.com/netdata/netdata/pull/16429), [@ilyam8](https://github.com/ilyam8)) - Improve logging clarity: 
removing logging errors from reading filtered alerts ([#16417](https://github.com/netdata/netdata/pull/16417), [@MrZammler](https://github.com/MrZammler)) - Bring back chart id to `title` in /api/v1/charts ([#16416](https://github.com/netdata/netdata/pull/16416), [@ilyam8](https://github.com/ilyam8)) - Fix an issue where reused connections were counted as new ([#16414](https://github.com/netdata/netdata/pull/16414), [@ilyam8](https://github.com/ilyam8)) - Remove queue limit from ACLK sync event loop ([#16411](https://github.com/netdata/netdata/pull/16411), [@stelfrag](https://github.com/stelfrag)) - Implement using /host/etc/hostname in Docker ([#16401](https://github.com/netdata/netdata/pull/16401), [@ilyam8](https://github.com/ilyam8)) - Fix v0 dashboard ([#16389](https://github.com/netdata/netdata/pull/16389), [@ilyam8](https://github.com/ilyam8)) - Use pre-configured message_ids to identify common logs ([#16383](https://github.com/netdata/netdata/pull/16383), [@ktsaou](https://github.com/ktsaou)) - Switch alarm_log to use the buffer json functions ([#16360](https://github.com/netdata/netdata/pull/16360), [@stelfrag](https://github.com/stelfrag)) - Switch to using buffer json functions in charts/chart endpoints ([#16359](https://github.com/netdata/netdata/pull/16359), [@stelfrag](https://github.com/stelfrag)) - Replace rrdset_is_obsolete & rrdset_isnot_obsolete ([#16351](https://github.com/netdata/netdata/pull/16351), [@MrZammler](https://github.com/MrZammler)) - Add rrddim_get_last_stored_value to simplify function code in internal collectors ([#16348](https://github.com/netdata/netdata/pull/16348), [@ilyam8](https://github.com/ilyam8)) - Add api/v2 support to h2o ([#16340](https://github.com/netdata/netdata/pull/16340), [@underhood](https://github.com/underhood)) - Improve unittests ([#16329](https://github.com/netdata/netdata/pull/16329), [@stelfrag](https://github.com/stelfrag)) - Fix coverity warnings in cgroups 
([#16328](https://github.com/netdata/netdata/pull/16328), [@ilyam8](https://github.com/ilyam8)) - Rename newly added functions ([#16325](https://github.com/netdata/netdata/pull/16325), [@ktsaou](https://github.com/ktsaou)) - Improve performance of alarm log queries by keeping precompiled statements ([#16321](https://github.com/netdata/netdata/pull/16321), [@stelfrag](https://github.com/stelfrag)) - Fix journal file index when collision is detected ([#16319](https://github.com/netdata/netdata/pull/16319), [@stelfrag](https://github.com/stelfrag)) - Optimize database before agent shutdown ([#16317](https://github.com/netdata/netdata/pull/16317), [@stelfrag](https://github.com/stelfrag)) - Improve shutdown when collectors are active ([#16315](https://github.com/netdata/netdata/pull/16315), [@stelfrag](https://github.com/stelfrag)) - Fix using absolute path for echo in netdata-claim.sh ([#16300](https://github.com/netdata/netdata/pull/16300), [@ilyam8](https://github.com/ilyam8)) - Fix various issues identified by coverity ([#16294](https://github.com/netdata/netdata/pull/16294), [@ktsaou](https://github.com/ktsaou)) - Fix renames in freebsd ([#16292](https://github.com/netdata/netdata/pull/16292), [@ktsaou](https://github.com/ktsaou)) - Fix retention loading ([#16290](https://github.com/netdata/netdata/pull/16290), [@ktsaou](https://github.com/ktsaou)) - Add cmd args for reading specific files to local_listeners ([#16273](https://github.com/netdata/netdata/pull/16273), [@ilyam8](https://github.com/ilyam8)) - Fix the issue of streaming disconnection when sending REPORT_JOB_STATUS ([#16272](https://github.com/netdata/netdata/pull/16272), [@underhood](https://github.com/underhood)) - Fix sources match in systemd-journal plugin ([#16271](https://github.com/netdata/netdata/pull/16271), [@ktsaou](https://github.com/ktsaou)) - Fix coverity issue 403725 ([#16265](https://github.com/netdata/netdata/pull/16265), [@stelfrag](https://github.com/stelfrag)) - Improve dimension ML 
model load ([#16262](https://github.com/netdata/netdata/pull/16262), [@stelfrag](https://github.com/stelfrag)) - Improvements to Dyncfg ([#16250](https://github.com/netdata/netdata/pull/16250), [@ktsaou](https://github.com/ktsaou)) - Drop an unused index from aclk_alert table ([#16242](https://github.com/netdata/netdata/pull/16242), [@stelfrag](https://github.com/stelfrag)) - Add DYNCFG_RESET command to Dynamic configuration ([#16241](https://github.com/netdata/netdata/pull/16241), [@underhood](https://github.com/underhood)) - Reuse ML load prepared statement ([#16240](https://github.com/netdata/netdata/pull/16240), [@stelfrag](https://github.com/stelfrag)) - Fix meta unittest ([#16221](https://github.com/netdata/netdata/pull/16221), [@stelfrag](https://github.com/stelfrag)) - Minimize hashtable collisions in facets ([#16215](https://github.com/netdata/netdata/pull/16215), [@ktsaou](https://github.com/ktsaou)) - Improve context load on startup ([#16203](https://github.com/netdata/netdata/pull/16203), [@stelfrag](https://github.com/stelfrag)) - Add support for h2o evloop netdata stream ([#14868](https://github.com/netdata/netdata/pull/14868), [@underhood](https://github.com/underhood)) - Add Logs Management ([#13291](https://github.com/netdata/netdata/pull/13291), [@Dim-P](https://github.com/Dim-P)) </details> ## Deprecation notice <a id="v1440-deprecation-notice"></a> ### Changed in this release In accordance with our previous [deprecation notice](https://github.com/netdata/netdata/releases/tag/v1.43.0#v1430-deprecation-notice), the following items in this release have been changed: - The [charts.d/nut](https://github.com/netdata/netdata/tree/v1.42.4/collectors/charts.d.plugin/nut#readme) collector has been removed. Replacement - [go.d/upsd](https://github.com/netdata/go.d.plugin/tree/master/modules/upsd#ups-nut). Other unannounced changes: - Netdata internal metrics (Netdata Monitoring section) are disabled by default to reduce the overall data volume. 
Later we plan to enable only important internal metrics by default. To re-enable them, uncomment the following options in `netdata.conf` and change `no` to `yes`:

  ```ini
  [plugins]
  # netdata monitoring = no
  # netdata monitoring extended = no
  ```

- Logging
  - Logs format changed to [logfmt](https://brandur.org/logfmt).
  - Default logging destination changed to systemd-journal (systemd-only): logs are now sent to the "netdata" namespace in systemd-journal. Systemd-journal provides a centralized repository for all system logs, making it easier to manage and search for logs. To keep using file-based logging instead, change the settings under the `[logs]` section of `netdata.conf`.
  - File-based logging: error.log renamed to daemon.log.

### Will be changed in the next release

- To ensure seamless compatibility with future updates, we recommend transitioning from [source-built installations](https://learn.netdata.cloud/docs/installing/build-the-netdata-agent-yourself/compile-from-source-code) to our distribution packages or static binaries. Starting with our next release, we will no longer guarantee compatibility when updating source-built installations. This change allows us to focus on enhancing the stability and feature delivery for the rest of our supported [installation methods](https://learn.netdata.cloud/docs/installing/).
- [Gorilla](#v1440-release-highlights-gorilla) compression will be enabled by default.
- The [Google Cloud Pub Sub](https://learn.netdata.cloud/docs/exporting/google-cloud-pub-sub) and the [AWS Kinesis](https://learn.netdata.cloud/docs/exporting/aws-kinesis) exporters will be removed in the next release. Neither was maintained or used when building packages. Users can consult the [exporting documentation](https://learn.netdata.cloud/docs/exporting/exporting-quickstart) for alternative exporters to use.
- The [database modes](https://learn.netdata.cloud/docs/configuring/optimizing-metrics-database/database-modes-for-parent-child-setups) `map` and `save` will be removed in the next release. The `dbengine` database mode will be used to persist metrics on disk automatically.
- Per-core CPU metrics will be disabled by default to reduce data volume, which improves performance and resource utilization. Summary (per-system) metrics are still collected. Disabled metrics:
  - `cpu.cpu` (utilization).
  - `cpu.interrupts` (all interrupts).
  - `cpu.softirqs` (software interrupts).
  - `cpu.softnet_stat` (software interrupts related to network receive work).
  - `cpu.cpu_cstate_residency_time` (idle states).

  To re-enable them, uncomment the following options in `netdata.conf` and change `no` to `yes`:

  ```ini
  [plugin:proc:/proc/stat]
  # per cpu core utilization = no
  # cpu idle states = no
  [plugin:proc:/proc/interrupts]
  # interrupts per core = no
  [plugin:proc:/proc/softirqs]
  # interrupts per core = no
  [plugin:proc:/proc/net/softnet_stat]
  # softnet_stat per core = no
  ```

- To optimize system performance, several eBPF.plugin modules have been disabled by default. While these modules provide valuable insights into system resource usage, they can also contribute to system overhead. They will expose metrics using Functions (run on demand and for a limited period of time). These modules include:
  - cachestat
  - fd
  - process
  - oomkill
  - shm
  - swap

## Netdata Release Meetup <a id="v1440-netdata-release-meetup"></a>

Join the Netdata team on the 11th of December at 16:30 UTC for the [Netdata Release Meetup](https://www.meetup.com/netdata/events/297755934/). Together we’ll cover:

* Release Highlights.
* Acknowledgments.
* Q&A with the community.

[RSVP now](https://www.meetup.com/netdata/events/297755934/) - we look forward to meeting you.

## Support options <a id="v1440-support-options"></a>

As we grow, we stay committed to providing the best support ever seen from an open-source solution.
Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it. - [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1800 engineers are already using it! 2023-12-06T18:15:03+00:00 netdata v1.44.1 netdata v1.44.1 2023-12-12T18:54:09+00:00 Netdata v1.44.1 is a patch release to address issues discovered since [v1.44.0](https://github.com/netdata/netdata/releases/tag/v1.44.0). This patch release provides the following bug fixes and updates: - Fixed an issue in the uninstall script that prevented log2journal and systemd-cat-native from being removed ([#16585](https://github.com/netdata/netdata/pull/16585), [@ilyam8](https://github.com/ilyam8)). - Fixed a bug that caused the debugfs.plugin to not terminate upon receiving a SIGPIPE (Broken Pipe) signal ([#16569](https://github.com/netdata/netdata/pull/16569), [@ilyam8](https://github.com/ilyam8)). - Fixed memory leak during host chart label cleanup ([#16568](https://github.com/netdata/netdata/pull/16568), [@stelfrag](https://github.com/stelfrag)). - Fixed incorrect cpu architecture/ram/disk values in build info ([#16567](https://github.com/netdata/netdata/pull/16567), [@ilyam8](https://github.com/ilyam8)). 
- Fixed a bug that prevented the parent from accepting streaming connections on systems with one CPU core ([#16565](https://github.com/netdata/netdata/pull/16565), [@stelfrag](https://github.com/stelfrag)). - Made systemd-journal a mandatory package on CentOS 7 and Amazon Linux 2 ([#16562](https://github.com/netdata/netdata/pull/16562), [@tkatsoulas](https://github.com/tkatsoulas)). - Fixed crash on reading memory clock speed of an AMD graphics card ([#16561](https://github.com/netdata/netdata/pull/16561), [@MrZammler](https://github.com/MrZammler)). - Fixed an unhandled error that occurred when setting file capabilities in the Debian postinst script of the perf.plugin ([#16558](https://github.com/netdata/netdata/pull/16558), [@tkatsoulas](https://github.com/tkatsoulas)). - Fixed an issue where the user's netdata home directory was set to an incorrect value ([#16548](https://github.com/netdata/netdata/pull/16548), [@ilyam8](https://github.com/ilyam8)). - Added the lightweight text editor to the Docker image ([#254](https://github.com/netdata/helper-images/pull/254), [@tkatsoulas](https://github.com/tkatsoulas)). ## Support options <a id="v1441-support-options"></a> As we grow, we stay committed to providing the best support ever seen from an open-source solution. Should you encounter an issue with any of the changes made in this release or any feature in the Netdata Agent, feel free to contact us through one of the following channels: - [Netdata Learn](https://learn.netdata.cloud): Find documentation, guides, and reference material for monitoring and troubleshooting your systems with Netdata. - [GitHub Issues](https://github.com/netdata/netdata/issues): Make use of the Netdata repository to report bugs or open a new feature request. - [GitHub Discussions](https://github.com/netdata/netdata/discussions): Join the conversation around the Netdata development process and be a part of it.
- [Community Forums](https://community.netdata.cloud/): Visit the Community Forums and contribute to the collaborative knowledge base. - [Discord Server](https://discord.gg/2eduZdSeC7): Jump into the Netdata Discord and hang out with like-minded sysadmins, DevOps, SREs, and other troubleshooters. More than 1700 engineers are already using it! 2023-12-12T18:54:09+00:00