
[feature-request] Environment variables for Nginx client_body_buffer_size and subrequest_output_buffer_size for prebuilt TensorFlow #5089

@Kaylee-Govender

Description


Concise Description:
Use case: deploying models with the prebuilt TensorFlow inference images. These models process large payloads, as is expected for a SageMaker asynchronous endpoint.

Issue: When you pull the prebuilt container of choice (example: 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.18-cpu) and run inference locally with a payload larger than 100 MB, you see the error below:
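
To reproduce, a dummy JSON body of roughly that size can be generated first. This snippet is illustrative only (the file name matches the curl command below; real model input will differ, and presumably only the size matters for triggering the nginx error):

# generate a ~190 MB dummy JSON body
python3 -c 'import json, sys; json.dump({"instances": [[0.0] * 128] * 300000}, sys.stdout)' > /tmp/large_payload.json
ls -lh /tmp/large_payload.json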

Input:

 time curl -v -X POST http://localhost:8080/invocations \
    -H "Content-Type: application/json" \
    -d @/tmp/large_payload.json

Output:

Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /invocations HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 177237594
> Expect: 100-continue
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Server: nginx/1.26.3
< Date: Thu, 24 Jul 2025 09:52:00 GMT
< Content-Type: text/html
< Content-Length: 177
< Connection: close
< 
<html>
<head><title>500 Internal Server Error</title></head>
<body>
<center><h1>500 Internal Server Error</h1></center>
<hr><center>nginx/1.26.3</center>
</body>
</html>
* Closing connection 0

real    0m0.409s
user    0m0.169s
sys     0m0.153s

Workaround: extend the prebuilt image with a modified nginx.conf.template:

FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.18-cpu

# Copy the customized nginx configuration template
COPY nginx.conf.template sagemaker/nginx.conf.template
  • the nginx.conf.template increases the parameters client_body_buffer_size and subrequest_output_buffer_size from the original 100m to a larger size (e.g. 200m, as my payload was 177 MB), as seen below, and this resolved the error:
load_module modules/ngx_http_js_module.so;

worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log  /dev/stderr %NGINX_LOG_LEVEL%;

worker_rlimit_nofile 4096;

events {
  worker_connections 2048;
}

http {
  include /etc/nginx/mime.types;
  default_type application/json;
  access_log /dev/stdout combined;
  js_import tensorflowServing.js;

  proxy_read_timeout %PROXY_READ_TIMEOUT%;  

  upstream tfs_upstream {
    %TFS_UPSTREAM%;
  }

  upstream gunicorn_upstream {
    server unix:/tmp/gunicorn.sock fail_timeout=1;
  }

  server {
    listen %NGINX_HTTP_PORT% deferred;
    client_max_body_size 0;
    client_body_buffer_size 200m;         # originally 100m
    subrequest_output_buffer_size 200m;   # originally 100m

    set $tfs_version %TFS_VERSION%;
    set $default_tfs_model %TFS_DEFAULT_MODEL_NAME%;

    location /tfs {
        rewrite ^/tfs/(.*) /$1  break;
        proxy_redirect off;
        proxy_pass_request_headers off;
        proxy_set_header Content-Type 'application/json';
        proxy_set_header Accept 'application/json';
        proxy_pass http://tfs_upstream;
    }

    location /ping {
        %FORWARD_PING_REQUESTS%;
    }

    location /invocations {
        %FORWARD_INVOCATION_REQUESTS%;
    }

    location /models {
        proxy_pass http://gunicorn_upstream/models;
    }

    location / {
        return 404 '{"error": "Not Found"}';
    }

    keepalive_timeout 3;
  }
}
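
A custom image like this might be built and smoke-tested locally along the following lines (the image tag is a placeholder; port 8080, the /opt/ml/model mount, and the serve command follow the usual SageMaker hosting conventions, so adjust to your setup):

docker build -t tf-inference-custom-buffers .
docker run --rm -p 8080:8080 \
    -v "$PWD/model:/opt/ml/model" \
    tf-inference-custom-buffers serve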

DLC image/dockerfile:
example: 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.18-cpu

Describe the solution you'd like
Introduce environment variables to change client_body_buffer_size and subrequest_output_buffer_size, instead of having to build a custom container to change this limit.
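
From a user's perspective, the end state might look like the sketch below. The variable names are only suggestions (anything like SAGEMAKER_NGINX_* is hypothetical, not an existing container option), with 100m remaining the default when the variables are unset:

# hypothetical environment variables for the proposed feature
docker run --rm -p 8080:8080 \
    -e SAGEMAKER_NGINX_CLIENT_BODY_BUFFER_SIZE=200m \
    -e SAGEMAKER_NGINX_SUBREQUEST_OUTPUT_BUFFER_SIZE=200m \
    763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.18-cpu serve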
