httpFS - Can not create file when it does not exist #72

@hahafamilia

<match fluentd.test>
   @type webhdfs
   path /tmp/fluentd/test/test.log
   host myhttpfs.example.com
   port 14000
   httpfs true
   username admin
   flush_interval 5s
</match>
2020-01-30 19:17:31 +0900 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2020-01-30 19:17:32 +0900 chunk="59d58c35c3f9c0fc061dabc8b3243994" error_class=WebHDFS::ServerError error="Failed to connect to host myhttpfs.example.com:14000, end of file reached"
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:345:in `rescue in request'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:342:in `request'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:273:in `operate_requests'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:73:in `create'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:274:in `rescue in send_data'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:271:in `send_data'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:389:in `block in write'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:335:in `compress_context'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:388:in `write'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1125:in `try_flush'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1431:in `flush_thread_run'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:461:in `block (2 levels) in start'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-01-30 19:17:34 +0900 [warn]: #0 failed to communicate hdfs cluster, path: /tmp/fluentd/20200130/access.log

I am using Cloudera CDH 6.1.
I have configured the plugin to use 'httpfs'.
The plugin cannot create the file when it does not already exist.
I have read issue #46.
I think I found the cause in the Cloudera documentation.
Could you please check this link?

Create and Write to a file

Note that the reason of having two-step create/append is for preventing clients to send out data before the redirect.
This issue is addressed by the “Expect: 100-continue” header in HTTP/1.1; see RFC 2616, Section 8.2.3.
Unfortunately, there are software library bugs (e.g. Jetty 6 HTTP server and Java 6 HTTP client),
which do not correctly implement “Expect: 100-continue”.
The two-step create/append is a temporary workaround for the software library bugs.

RFC 2616, Section 8.2.3.

The file was created when I tested without sending any data.
The request that carries the data must include the header 'Content-Type: application/octet-stream'.
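
To make the observation above concrete, here is a minimal sketch of the two requests, written with Ruby's standard net/http. It is not the plugin's or the webhdfs gem's actual code. The helper name build_create_request is hypothetical, and the data=true query parameter on the second request is an assumption about what HttpFS expects that this issue does not confirm. The helper only builds the requests; it does not send them.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a WebHDFS CREATE request
# aimed at an HttpFS gateway.
#   body == nil -> first step: create the file without sending any data
#   body given  -> second step: carry the data with the octet-stream header
def build_create_request(host, port, path, user, body = nil)
  params = { 'op' => 'CREATE', 'user.name' => user }
  # Assumption: HttpFS wants data=true on the request that carries the body.
  params['data'] = 'true' if body

  uri = URI::HTTP.build(host: host, port: port,
                        path: "/webhdfs/v1#{path}",
                        query: URI.encode_www_form(params))

  req = Net::HTTP::Put.new(uri)
  if body
    # The header reported as required in this issue.
    req['Content-Type'] = 'application/octet-stream'
    req.body = body
  end
  [uri, req]
end

# Step 1: no body, no Content-Type header.
uri1, step1 = build_create_request('myhttpfs.example.com', 14000,
                                   '/tmp/fluentd/test/test.log', 'admin')

# Step 2: body plus 'Content-Type: application/octet-stream'.
uri2, step2 = build_create_request('myhttpfs.example.com', 14000,
                                   '/tmp/fluentd/test/test.log', 'admin',
                                   "hello\n")
```

Each request could then be sent with Net::HTTP.start(uri1.host, uri1.port) { |http| http.request(step1) }, and likewise for uri2 and step2. Whether the second request needs overwrite=true or should be an APPEND instead depends on the server and is not covered by this sketch.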
