Prometheus, JMX, and Zookeeper

Posted on Sunday, May 21, 2017



 I am still poking around Prometheus.   So far I am liking it a lot. 


In this article I am going to go over how to get it connected to JMX in Zookeeper to start getting metrics out of those systems for monitoring.  I did another article on how to get it set up with Kafka  ( see http://www.whiteboardcoder.com/2017/04/prometheus-and-jmx.html [1] )






Be forewarned, this is going to be a long article as I poke at this tool to get it working the way I want it to.





Prometheus JMX Exporter


The JMX Exporter is located at https://github.com/prometheus/jmx_exporter/ [3] on GitHub.  It is a lightweight HTTP server that exposes JMX data as Prometheus-compatible metrics that Prometheus can then scrape.






VisualVM and JMX


The first thing I want to do is turn JMX on for zookeeper.  Once that is set up I can download VisualVM and use it to connect to zookeeper and see some stats.

I have a very basic kafka/zookeeper setup.  Poking around, I found an environment file that is used by the init script: /etc/zookeeper/conf/environment



From /etc/init.d/zookeeper

I am going to edit that file so that it will turn on JMX


  > sudo vi /etc/zookeeper/conf/environment




And update the JAVA_OPTS section with this


#Should retrieve local IP address
IP_ADDR=`ip route get 8.8.8.8 | awk '{print $NF; exit}'`
JAVA_OPTS="$JAVA_OPTS -Djava.rmi.server.hostname=$IP_ADDR"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.port=9696"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS="$JAVA_OPTS -Dcom.sun.management.jmxremote.ssl=false"
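One caveat with the `$NF` approach above: on some newer systems the output of `ip route get` can end with extra fields (for example a `uid` entry), so the last field is no longer the address. Here is a minimal sketch of a more defensive lookup that prints the word following `src` instead. The `src_ip` helper is my own hypothetical name, not something from the zookeeper packaging, and I am assuming the environment file is sourced by a shell that allows functions.

```shell
# Hypothetical helper: read `ip route get` output on stdin and print the
# word that follows the "src" keyword (the local source address).
src_ip() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Intended use (left as a comment so this block has no side effects):
#   IP_ADDR=$(ip route get 8.8.8.8 | src_ip)
```

If no `src` field is present the helper prints nothing, so it is worth checking that IP_ADDR is not empty before relying on it.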




Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart




Let's see if that worked.


  > sudo ps -Af | grep 9696




That appears to be working.

Now onto using VisualVM to connect into it.





Download Visual VM






Download it, then unzip it and start it up.





Accept the license





Add a JMX Connection.






Enter the IP address and port (in my case the kafka/zookeeper server lives at 192.168.0.140 and the JMX port is 9696).  Then click OK





Now you should have this.
Double Click on it.




Click on the monitor tab and you can now see stuff scrolling by.


 


Go to Tools -> Plugins
 


Select the Available Plugins tab, check the VisualVM-MBeans box, and click Install.


 


Next



 



Accept the license and click Install


 


Finish


 

Close the connection




Now you should have an MBeans tab


 


There are your mbeans :)
This is a standalone zookeeper so my specific zookeeper mbeans are very limited.




Here is a zookeeper page that lists their mbeans in JMX (see [4])



The bottom section is the standalone mbeans.




There are some values.






Install Prometheus JMX exporter and get it set up






Download the jar file


I am going to make a folder to do all this work in.


  > cd
  > mkdir jmx_exporter_zookeeper
  > cd jmx_exporter_zookeeper
  > wget https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.9/jmx_prometheus_javaagent-0.9.jar


Create a configuration file in yaml format


  > vi zookeeper_kafkfa.yml


This is going to be a very basic yaml file to start with.  It will read in all the mbeans data and use the default settings for them.


---
lowercaseOutputName: true








Adding jmx exporter as javaagent to zookeeper


Edit the same zookeeper environment file again.


  > sudo vi /etc/zookeeper/conf/environment


And add this to the JAVA_OPTS section


##JMX Exporter Agent
JMX_DIR="/home/patman/jmx_exporter_zookeeper"
JAVA_OPTS="$JAVA_OPTS -javaagent:$JMX_DIR/jmx_prometheus_javaagent-0.9.jar=1234:$JMX_DIR/zookeeper_kafkfa.yml"
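The agent option has the shape -javaagent:<jar>=<port>:<config>, and a wrong jar path will typically keep the JVM from starting at all, so it can be worth checking the pieces before restarting. Below is a small sketch of that check. The `javaagent_opt` function is a hypothetical helper of mine, not part of the exporter or of zookeeper.

```shell
# Hypothetical helper: build the -javaagent option from a jar path, a port,
# and a config path. It prints nothing and returns non-zero when either
# file is unreadable or the port is not made up of digits only.
javaagent_opt() {
  jar=$1
  port=$2
  config=$3
  [ -r "$jar" ] || { echo "missing jar: $jar" >&2; return 1; }
  [ -r "$config" ] || { echo "missing config: $config" >&2; return 1; }
  case $port in
    ''|*[!0-9]*) echo "bad port: $port" >&2; return 1 ;;
  esac
  printf '%s\n' "-javaagent:$jar=$port:$config"
}

# Intended use (commented out; the paths are the ones from this article):
#   javaagent_opt "$JMX_DIR/jmx_prometheus_javaagent-0.9.jar" 1234 "$JMX_DIR/zookeeper_kafkfa.yml"
```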




Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart




Now try to curl the /metrics endpoint


  > curl localhost:1234/metrics


 


Wahoo it worked!


But we have at least two problems.

1.      We have all the metrics, and we may not want them all
2.      All zookeeper metrics are being reported as gauges, but some need to be typed as counters.

So how do we fix that? 

We fix it by tweaking the config yaml file we made.
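Before changing anything, it can help to see how many metrics are declared under each type. The sketch below counts the `# TYPE` lines in the exporter output. `count_types` is a hypothetical helper name, and it assumes the usual Prometheus text format where each type line looks like `# TYPE <name> <type>`.

```shell
# Hypothetical helper: read Prometheus text-format output on stdin and
# print how many metrics are declared with each type, one type per line.
count_types() {
  awk '$1 == "#" && $2 == "TYPE" { n[$4]++ } END { for (t in n) print t, n[t] }' | sort
}

# Intended use (commented out so the block runs without a live exporter):
#   curl -s localhost:1234/metrics | count_types
```

Running it again after each config change gives a quick check that the counters are actually being picked up as counters.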




Filtering the data


First, run a few curls to see how many data points we are getting.


This will list all the individual metrics and remove the comments/settings that start with a # symbol


  > curl -s localhost:1234/metrics | grep -v "^#"







Get a total count


  > curl -s localhost:1234/metrics | grep -v "^#" | wc -l




I have a total of 287 records.
Some of these records are generic JVM info, like memory size, and some are specific to zookeeper.


Get a total count of just zookeeper data


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zookeeper | wc -l
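Instead of one grep per prefix, the same counting can be done for every prefix at once. This is only a sketch: `count_by_prefix` is a hypothetical helper, and it treats everything before the first underscore in a metric name as its prefix.

```shell
# Hypothetical helper: read Prometheus text-format output on stdin, drop
# comment lines, and count sample lines by the part of the metric name
# that comes before the first underscore.
count_by_prefix() {
  grep -v '^#' |
    awk 'NF { split($1, parts, "_"); n[parts[1]]++ } END { for (p in n) print p, n[p] }' |
    sort
}

# Intended use (commented out so the block runs without a live exporter):
#   curl -s localhost:1234/metrics | count_by_prefix
```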






I can add a whitelist to the yaml file that will filter on MBean object names

Here is my first go at it.



  > vi zookeeper_kafkfa.yml


Update it to the following


---
lowercaseOutputName: true
whitelistObjectNames: ["java.lang.OperatingSystem:*"]




Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart




See how many stats I have now


  > curl -s localhost:1234/metrics | grep -v "^#" | wc -l




Down to 48 and if you look at them all they are just java_lang stats.





Second go at it (just getting the zookeeper metrics)



  > vi zookeeper_kafkfa.yml


Update it to the following  (only get the zookeeper metrics)


---
lowercaseOutputName: true
whitelistObjectNames: ["org.apache.ZooKeeperService:*"]





Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart




See how many stats I have now


  > curl -s localhost:1234/metrics | grep -v "^#" | wc -l




I got 69 back.  Looking through the output:



I have zookeeper metrics and I also have the java jvm data.  Turns out no matter what whitelist you set you will always get the jvm data.

Which is fine by me.






JMX trickiness


Now for the fun part.  I need to further filter out the zookeeper data and also set some as counters.



I am going to start with this one metric  StandaloneServer_port2181:PacketsReceived.

This metric is a counter, not a gauge; it always increases.

I want to fix my yaml file so I only get this zookeeper metric and it shows up as a counter.



Running this curl I can see it shows up as a gauge.


  > curl -s localhost:1234/metrics | grep -B 2 packetsreceived




I also happen to see the other specific packets received per server connected at




I think I will try and get both of them if I can, but only those from the zookeeper data.





Fun part


To filter this data I need to pick three parts out of the JMX data to use in a whitelistObjectNames entry.

1.      Domain
2.      Type
3.      Name





Select Standalone



Click on Metadata and look at ObjectName




1.      Domain




In this case the domain = org.apache.ZooKeeperService




2.      Type




This appears to have no type

In this case type = *



3.      Name



In this case the name0 = StandaloneServer_port2181


I can use these first three parts and create a whitelistObjectNames.
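If you would rather not read the parts off the VisualVM screen, an ObjectName copied from the Metadata tab can be split on the command line. This is a rough sketch: `split_object_name` is a hypothetical helper, and it assumes a simple name of the form domain:key=value,key=value with no quoted values or commas inside values.

```shell
# Hypothetical helper: print the domain of a JMX ObjectName on the first
# line, then each key=value property on its own line.
split_object_name() {
  name=$1
  printf 'domain=%s\n' "${name%%:*}"
  printf '%s\n' "${name#*:}" | tr ',' '\n'
}

# Example:
#   split_object_name 'org.apache.ZooKeeperService:name0=StandaloneServer_port2181'
```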



  > vi zookeeper_kafkfa.yml


Here is a simple yaml file using that.



---
lowercaseOutputName: true
whitelistObjectNames: ["org.apache.ZooKeeperService:name0=StandaloneServer_port2181,*"]



Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart


Running this curl I can count the zookeeper metrics.


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zoo | wc -l




This gave me 21, so no reduction yet.




More tweaking: setting rules

  

  > vi zookeeper_kafkfa.yml


Update it to.



---
lowercaseOutputName: true
#whitelistObjectNames: ["org.apache.ZooKeeperService:*"]
whitelistObjectNames: ["org.apache.ZooKeeperService:name0=StandaloneServer_port2181,*"]
rules:
   - pattern: org.apache.ZooKeeperService<name0=(.+)><>(PacketsReceived)
     name: zookeeper_server_$1
     type: COUNTER


Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart


Running this curl


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zoo | wc -l


 

Down to 2!

If I look at it

 


I can see I have two metrics that are both marked as counters.  So that part is good.  The second metric is coming from here.

 


I do want this data but I do not like how it is being named

zookeeper_server_standaloneserver_port2181_name1_connections_name2_0:0:0:0:0:0:0:1_name3_0x15c1706051b0000

I need to fiddle with the config file again.



  > vi zookeeper_kafkfa.yml


Update it to.



---
lowercaseOutputName: true
#whitelistObjectNames: ["org.apache.ZooKeeperService:*"]
whitelistObjectNames: ["org.apache.ZooKeeperService:name0=StandaloneServer_port2181,*"]
rules:
   - pattern: org.apache.ZooKeeperService<name0=(.+).name1=(.+).name2=(.+).name3=(.+)><>(PacketsReceived)
     name: zookeeper_server_$2_$3_$5
     type: COUNTER
   - pattern: org.apache.ZooKeeperService<name0=(.+)><>(PacketsReceived)
     name: zookeeper_server_$1_$2
     type: COUNTER


I added a new rule that will find this second one.  The patterns are applied in order, and the first one that matches is used.
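To see why the order of the rules matters, here is a rough stand-in for the "first match wins" behaviour. It is not the exporter's code: `first_match` is a hypothetical helper, `grep -E` only approximates the Java regular expressions the exporter uses, and the flattened bean strings in the examples are my assumption about what the patterns are matched against.

```shell
# Hypothetical helper: try each pattern in order against the input string
# and print the 1-based position of the first one that matches.
# Returns non-zero when no pattern matches.
first_match() {
  input=$1
  shift
  index=1
  for pattern in "$@"; do
    if printf '%s\n' "$input" | grep -Eq -e "$pattern"; then
      echo "$index"
      return 0
    fi
    index=$((index + 1))
  done
  return 1
}

# Example (the bean string is an assumed shape, not real exporter output):
#   first_match 'org.apache.ZooKeeperService<name0=StandaloneServer_port2181><>PacketsReceived' \
#     'name0=(.+).name1=(.+).name2=(.+).name3=(.+)><>(PacketsReceived)' \
#     'name0=(.+)><>(PacketsReceived)'
```

With the more specific connection pattern listed first, a connection bean lands on rule 1 and the server bean falls through to rule 2. If the general `name0=(.+)` pattern were listed first it would also match the connection bean, which is how the long metric name shown earlier came about.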

Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart


Running this curl


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zoo





Exactly what I wanted :)





More tweaking: all counters


Let me see which other data points I want that are COUNTERS.






Looks like I only want PacketsReceived and PacketsSent for all my COUNTERS.



  > vi zookeeper_kafkfa.yml


Update it to.



---
lowercaseOutputName: true
whitelistObjectNames: ["org.apache.ZooKeeperService:name0=StandaloneServer_port2181,*"]
rules:
   - pattern: org.apache.ZooKeeperService<name0=(.+).name1=(.+).name2=(.+).name3=(.+)><>(PacketsReceived|PacketsSent)
     name: zookeeper_server_$2_$3_$5
     type: COUNTER
   - pattern: org.apache.ZooKeeperService<name0=(.+)><>(PacketsReceived|PacketsSent)
     name: zookeeper_server_$1_$2
     type: COUNTER


I added PacketsSent to both patterns.  The patterns are still applied in order, and the first one that matches is used.

Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart




Running this curl


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zoo




Wahoo working!





More tweaking: all gauges


Let me see which other data points I want that are just normal gauges.




On the server itself:

AvgRequestLatency
MaxRequestLatency
MinRequestLatency
NumAliveConnections
OutstandingRequests

On each connection:

AvgLatency
MaxLatency
MinLatency
OutstandingRequests





  > vi zookeeper_kafkfa.yml


Update it to.


---
lowercaseOutputName: true
whitelistObjectNames: ["org.apache.ZooKeeperService:name0=StandaloneServer_port2181,*"]
rules:
   - pattern: org.apache.ZooKeeperService<name0=(.+).name1=(.+).name2=(.+).name3=(.+)><>(PacketsReceived|PacketsSent)
     name: zookeeper_server_$2_$3_$5
     type: COUNTER
   - pattern: org.apache.ZooKeeperService<name0=(.+)><>(PacketsReceived|PacketsSent)
     name: zookeeper_server_$1_$2
     type: COUNTER
   - pattern: org.apache.ZooKeeperService<name0=(.+).name1=(.+).name2=(.+).name3=(.+)><>(AvgLatency|MaxLatency|MinLatency|OutstandingRequests)
     name: zookeeper_server_$2_$3_$5
   - pattern: org.apache.ZooKeeperService<name0=(.+)><>(AvgRequestLatency|MaxRequestLatency|MinRequestLatency|NumAliveConnections|OutstandingRequests)
     name: zookeeper_server_$1_$2





Now restart zookeeper


  > sudo /etc/init.d/zookeeper restart



Running this curl


  > curl -s localhost:1234/metrics | grep -v "^#" | grep zoo | sort





Wahoo working!


References


[1]        Prometheus and JMX
[2]        Monitoring Apache Kafka with Prometheus
[3]        Monitoring Kafka with Prometheus
[4]        Zookeeper Mbeans
[5]        Prometheus git page

