Oct 12, 2012

failed to flush the buffer, retrying. error="no nodes are available" fluentd

I got another error message from fluentd.
  failed to flush the buffer, retrying. error="no nodes are available"

The full message, including the stack trace, is as follows.
2012-10-12 15:57:23 +0900: failed to flush the buffer, retrying. error="no nodes are available" instance=23456250987700
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/plugin/out_forward.rb:137:in `write_objects'
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/output.rb:439:in `write'
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/buffer.rb:279:in `write_chunk'
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/buffer.rb:263:in `pop'
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/output.rb:303:in `try_flush'
2012-10-12 15:57:23 +0900: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.25/lib/fluent/output.rb:120:in `run'


The reason
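fluentd uses UDP for its health check (heartbeat). In my case, the target servers are in the staging environment and the collecting server is in the production environment, so they are on different network segments, and UDP is prohibited by our network policy. Because the heartbeat never reached the collecting server, the forward output treated every node as unavailable.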

The solution
Configure your firewall to permit UDP between the two segments.
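
To confirm that UDP really gets through after changing the firewall, a small probe like the one below might help. It is only a sketch: the function name udp_heartbeat_ok is made up for this post, and it assumes that the receiving fluentd answers a single null byte sent to its UDP port (24224 here) with some reply. I have not checked that against every fluentd version, so treat "no reply" as a hint, not as proof.

```python
import socket


def udp_heartbeat_ok(host, port=24224, timeout=3.0):
    """Return True if a UDP datagram sent to (host, port) gets any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # Assumption: a single null byte is accepted as a heartbeat probe.
            sock.sendto(b"\0", (host, port))
            # Any datagram coming back counts as success.
            sock.recvfrom(1024)
            return True
        except OSError:
            # Covers a timeout (no reply), an ICMP "port unreachable"
            # response, and name resolution failures.
            return False
```

Run it from a target server with the collecting server's host name, for example udp_heartbeat_ok("collector.example.com") where the host name is a placeholder. If it returns False while a TCP connection to the same port works, the firewall is probably still dropping UDP.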

fluentd unexpected error error="Address already in use - bind(2)"

When I started fluentd, I got the following error message.

2012-10-12 14:52:07 +0900: unexpected error error="Address already in use - bind(2)"

I checked which program was using this port. As you can see, it was td-agent (fluentd) itself.
td-agent is supposed to listen on port 24224, so that is expected behavior.
[xxx]# sudo /usr/sbin/lsof -i:24224
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
ruby 22248 td-agent 14u IPv4 46635887 TCP *:24224 (LISTEN)
ruby 22248 td-agent 15u IPv4 46635888 UDP *:24224


But I had forgotten that I had already started it with the /usr/sbin/td-agent command.
It was entirely my mistake. I killed that process and restarted fluentd.
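
To catch this kind of mistake before starting fluentd, one option is to test whether the port can still be bound. The sketch below is not part of fluentd or td-agent. The helper name port_in_use is hypothetical, and it only checks TCP, while the lsof output above shows that td-agent holds the UDP port as well.

```python
import errno
import socket


def port_in_use(port=24224, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # The same bind(2) call that fails in the error message above.
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Other errors (for example, permission denied) are re-raised.
            raise
        return False
```

If port_in_use() returns True, another process (most likely an earlier td-agent) already holds the port, and lsof will show which one.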