Aug 1, 2011

Tuning Zabbix to improve its performance (II)

This article continues the previous one about tuning Zabbix to improve its performance. First of all, I am going to set suitable kernel parameters in the sysctl.conf file.

root@zbx01:~# cat /etc/sysctl.conf
...
# How aggressively the kernel swaps out process memory (0-100; lower values mean less swapping)
vm.swappiness = 10

# Number of open files for all processes
fs.file-max = 407020

# Minimum, default and maximum size of the send/receive buffer used by each TCP socket
net.ipv4.tcp_wmem = 8192        87380   16777216
net.ipv4.tcp_rmem = 8192        87380   16777216

# Maximum number of queued connection requests which have still not received an ACK (three-way handshake)
net.ipv4.tcp_max_syn_backlog = 2048

# Number of seconds to wait for a final FIN packet before the socket is forcibly closed
net.ipv4.tcp_fin_timeout = 25

# Number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes
net.ipv4.tcp_keepalive_time = 1200

# Maximum send buffer size (in bytes) that can be set for a socket
net.core.wmem_max = 16777216

# Maximum receive buffer size (in bytes) that can be set for a socket
net.core.rmem_max = 16777216

# Maximum size in bytes of a message queue
kernel.msgmnb = 65536

# Maximum size for a message text
kernel.msgmax = 65536

# Maximum size in bytes for a shared memory segment
kernel.shmmax = 68719476736

# System wide maximum of shared memory pages
kernel.shmall = 4294967296
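
Values written to /etc/sysctl.conf only take effect after they are loaded (for example with "sysctl -p") or after a reboot, so it can be useful to check which of them the running kernel has not picked up yet. The following is a minimal Python sketch of such a check, assuming the usual /proc/sys layout; the function names are only illustrative and are not part of any tool mentioned in this article.

```python
from pathlib import Path


def parse_sysctl(text):
    """Parse sysctl.conf-style text into {key: value}, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Collapse runs of whitespace so "8192    87380" equals "8192 87380".
        settings[key.strip()] = " ".join(value.split())
    return settings


def live_value(key, proc_root="/proc/sys"):
    """Read the running kernel's value for a key, or None if it cannot be read."""
    path = Path(proc_root) / key.replace(".", "/")
    try:
        return " ".join(path.read_text().split())
    except OSError:
        return None


def pending_changes(wanted, proc_root="/proc/sys"):
    """Return {key: (live, wanted)} for every key whose live value differs."""
    diff = {}
    for key, value in wanted.items():
        current = live_value(key, proc_root)
        if current != value:
            diff[key] = (current, value)
    return diff
```

To use it, pass the contents of /etc/sysctl.conf to parse_sysctl() and print whatever pending_changes() returns; an empty result means that the running kernel already matches the file.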

Next, I am going to adjust the MySQL settings in its configuration file. This part is really important if you want to achieve good performance.

To adjust them, I monitored the status of the database over several weeks, using tuning tools such as MySQL Performance Tuning Primer Script and MySQLTuner.

root@zbx01:~# cat /etc/mysql/my.cnf
...
# Size of the buffer used for index blocks
key_buffer = 16M

# Maximum size of one packet or any generated/intermediate string
max_allowed_packet = 16M

# Number of threads the server should cache for reuse
thread_cache_size = 64

# Maximum allowed number of simultaneous client connections
max_connections = 256

# Number of open tables for all threads
table_cache = 1024

# Number of table definitions that can be stored in the definition cache
table_definition_cache = 1024

# Do not cache results that are larger than this number of bytes
query_cache_limit = 16M

# Amount of memory allocated for caching query results
query_cache_size = 1024M

# Minimum size (in bytes) for blocks allocated by the query cache
query_cache_min_res_unit = 512

# 0: do not cache
# 1: cache all cacheable query results except for those that begin with SELECT SQL_NO_CACHE
# 2: cache results only for cacheable queries that begin with SELECT SQL_CACHE
query_cache_type = 1

# Slow queries are logged
log_slow_queries = /var/log/mysql/mysql-slow.log

# If a query takes longer than this value (seconds), the server logs the query
long_query_time = 5

# Queries that are expected to retrieve all rows are logged
log-queries-not-using-indexes

# Size in bytes of the memory buffer that InnoDB uses to cache data and indexes of its tables
innodb_buffer_pool_size = 4096M
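
Before settling on values like these, it helps to add up the buffers that MySQL allocates once for the whole server and compare the total with the RAM of the machine. The Python sketch below does that sum for the three global buffers in the listing above; the helper to_bytes is only illustrative, and per-connection buffers (which are not shown in this article) are deliberately left out.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(value):
    """Convert a my.cnf size such as '16M' or '512' to bytes."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


# Buffers allocated once for the whole server, taken from the listing above.
global_buffers = {
    "key_buffer": "16M",
    "query_cache_size": "1024M",
    "innodb_buffer_pool_size": "4096M",
}

total = sum(to_bytes(v) for v in global_buffers.values())
print(f"global buffers: {total // 1024**2} MiB")  # global buffers: 5136 MiB
```

The memory used by each client connection comes on top of this figure, so the machine needs a comfortable margin above it.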

With respect to MySQL, note that it is also important to defragment the query cache to improve its utilization, by running "flush query cache" on the database. In my installation, I have found that the optimum period is every hour.

root@zbx01:~# crontab -e
...
0 */1 * * * mysql -u root -pxxxxxx -e "flush query cache"
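
A rough way to decide how often the cache needs defragmenting is to compare the Qcache_free_blocks and Qcache_total_blocks status variables (both appear in the output of SHOW GLOBAL STATUS LIKE 'Qcache%'). The Python sketch below computes that ratio; the 20% threshold and the sample numbers are assumptions made for illustration, not measurements from my server.

```python
def qcache_fragmentation(status):
    """Percentage of query cache blocks that are free: a rough fragmentation indicator."""
    total = int(status["Qcache_total_blocks"])
    free = int(status["Qcache_free_blocks"])
    return 100.0 * free / total if total else 0.0


def should_flush(status, threshold_pct=20.0):
    """True when the indicator exceeds the threshold (an assumed value)."""
    return qcache_fragmentation(status) > threshold_pct


# Made-up values in the shape returned by SHOW GLOBAL STATUS LIKE 'Qcache%'.
sample = {"Qcache_total_blocks": "40000", "Qcache_free_blocks": "12000"}
print(qcache_fragmentation(sample), should_flush(sample))  # 30.0 True
```

Right after a "flush query cache" the free memory is merged into a single block, so watching how fast this percentage grows again gives an idea of a suitable period for the cron job.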

And finally, I have changed certain parameters in the Zabbix configuration file. The most important variable is the number of pre-forked pollers.

If this number is too low, your Zabbix server will not be able to save all monitored data and many values will be missing. This is because, if the server runs out of processes to attend to the requests, they will be discarded and not recorded.

root@zbx01:~# cat /etc/zabbix/zabbix_server.conf
...
# Number of pre-forked instances of pollers
StartPollers=96

# Shared memory size for storing hosts and items data
CacheSize=64M

# Shared memory size for storing history data
HistoryCacheSize=8M

# Shared memory size for storing trends data.
TrendCacheSize=8M

# Shared memory size for storing character, text or log history data
HistoryTextCacheSize=8M
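
As a starting point for StartPollers, the number of pollers that are busy at any moment can be approximated as the rate of polled checks multiplied by the average time that a check takes (Little's law). The Python sketch below applies that idea; the function name, the input figures and the 75% target load are all hypothetical, and it only concerns checks that are actually performed by pollers (active agent and trapper items are handled by other processes).

```python
import math


def pollers_needed(checks_per_second, avg_check_seconds, target_busy=0.75):
    """Estimate how many pollers keep each one busy at most target_busy of the time."""
    if not 0 < target_busy <= 1:
        raise ValueError("target_busy must be in (0, 1]")
    # Little's law: average number of checks in progress at any moment.
    busy_pollers = checks_per_second * avg_check_seconds
    return math.ceil(busy_pollers / target_busy)


# Hypothetical load: 200 polled checks per second, 0.25 s per check on average.
print(pollers_needed(checks_per_second=200, avg_check_seconds=0.25))  # 67
```

The result is only an estimate to be refined by observing the real server, which is how the values in the listing above were obtained.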

Regarding housekeeping, I have not modified any default parameter. Thus, the housekeeping procedure runs every hour and deletes all unnecessary values from the database.

If you notice that your server does not work properly because this task is using up a lot of resources (CPU, memory, I/O), you will have to adjust these options.

root@zbx01:~# cat /etc/zabbix/zabbix_server.conf
...
# Housekeeping is removing unnecessary information from history, alert, and alarm tables
# HousekeepingFrequency=1

# No more than MaxHousekeeperDelete rows will be deleted per one task in one housekeeping cycle
# MaxHousekeeperDelete=500

# Enable/disable housekeeping
# DisableHousekeeping=0
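
These two options trade load against cleanup speed. Taking the comments in the listing above at face value, the Python sketch below computes the maximum number of rows that one task can have deleted in a day; the function name is only illustrative.

```python
def max_deleted_per_task_per_day(max_housekeeper_delete=500, housekeeping_frequency=1):
    """Upper bound on rows removed per task per day.

    housekeeping_frequency is expressed in hours, as in zabbix_server.conf.
    """
    if housekeeping_frequency < 1:
        raise ValueError("housekeeping_frequency must be at least 1 hour")
    cycles_per_day = 24 / housekeeping_frequency
    return int(max_housekeeper_delete * cycles_per_day)


print(max_deleted_per_task_per_day())        # 12000 with the defaults shown above
print(max_deleted_per_task_per_day(500, 6))  # 2000 if the cycle runs every 6 hours
```

Running the cycle less often or lowering the limit reduces the load of each run, but the pending rows take longer to disappear.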


5 comments:

  1. thanks for your post. how did you come by your /etc/sysctl.conf & server.conf settings?

  2. About the sysctl.conf file, this is a typical configuration that I usually apply when I install a Linux server, in order to improve its performance.

    For the second one, I tracked those values over several days to study their behavior (in fact, they are items that you can get from Zabbix), and in this way I was able to adjust them correctly (the key is to leave them enough margin).

  3. I have to say the query cache settings for mysql are bad

    it gets much more expensive in cpu time and performance to clear the query cache when it's larger than around 128MB, and doubly worse, you are recommending clearing it every hour. it's better to save that ram for a larger innodb buffer or other more useful mysql settings like tmp_table_size and max_heap_table_size. i'd also recommend innodb_file_per_table

    i'd also disable log_slow_queries unless you want to get into debugging and developing zabbix, as you don't need that performance hit if you can help it

    Replies
    1. e.g. here are my mysql settings that differ from the defaults

      skip-name-resolve #you're only connecting from localhost, right? either way this skips the dns lookup step, which does improve performance in many cases
      table_cache = 1024
      thread_concurrency = 10

      query_cache_limit = 4M
      query_cache_size = 64M #i find you're only really using this for reading the data, not inserting the data, which is the majority of Zabbix server's work

      tmp_table_size =128M
      max_heap_table_size =128M

      innodb_file_per_table
      innodb_buffer_pool_size=3G #i have an 8GB server so this is generally ok for me - this varies depending on your ram
      innodb_flush_log_at_trx_commit=0 #ONLY do this if you're not worried about ACID compliance (i generally find it acceptable to lose a few seconds of logging data in the event of a crash for the sake of performance)
      innodb_thread_concurrency=4

    2. You are totally wrong.

      Each configuration depends on the kind of installation, and my platform had to support more than 400 devices. In the first article, if you take a look at the first picture, you will see that the Required server performance, new values per second, in my case is 1687. Could you tell me your value? Are we talking about the same kind of installation?

      When you have to process more than 1600 values per second, if you do not assign enough memory to your cache, your performance will be very poor. And if I have to clear the cache every hour, it is because the Qcache_free_blocks parameter was indicating (in my case) that the memory (cache) was getting fragmented quickly. And fragmented memory does not work properly.

      To get the values for MySQL, I let the installation run for several weeks so that it could settle, and then I ran MySQL Performance Tuning Primer Script and MySQLTuner in order to obtain the best parameters.

      Also, if you think that clearing the cache consumes a lot of CPU and memory, you are wrong as well. You can check this operation with strace.
