"Header content contains invalid characters" error when piping multipart upload part into a new request


My express server receives file uploads from browsers. The uploads are transferred as multipart/form-data requests; I use multiparty to parse the incoming entity body.

Multiparty allows you to get a part (roughly, a single form field like an <input type="file">) as a readable stream. I do not want to process or store the uploaded file(s) on my web server, so I just pipe the uploaded file part into a request made to another service (using the request module).

app.post('/upload', function(req, res) {
    var form = new multiparty.Form();

    form.on('part', function(part) {

        var serviceRequest = request({
            method: 'POST',
            url: 'http://other-service/process-file',
            headers: {
                'Content-Type': 'application/octet-stream'
            }
        }, function(err, svcres, body) {
            // handle response
        });

        part.pipe(serviceRequest);
    });

    form.parse(req);
});

This works correctly most of the time. Node automatically applies chunked transfer encoding, and as the browser uploads file bytes, they are forwarded to the backend service as a raw entity body (without the multipart framing). The backend ultimately receives the complete file and responds successfully.

However, sometimes the request fails and my callback gets called with this err:

TypeError: The header content contains invalid characters 
    at ClientRequest.OutgoingMessage.setHeader (_http_outgoing.js:360:11) 
    at new ClientRequest (_http_client.js:85:14) 
    at Object.exports.request (http.js:31:10) 
    at Object.exports.request (https.js:199:15) 
    at Request.start (/app/node_modules/request/request.js:744:32) 
    at Request.write (/app/node_modules/request/request.js:1421:10) 
    at PassThrough.ondata (_stream_readable.js:555:20) 
    at emitOne (events.js:96:13) 
    at PassThrough.emit (events.js:188:7) 
    at PassThrough.Readable.read (_stream_readable.js:381:10) 
    at flow (_stream_readable.js:761:34) 
    at resume_ (_stream_readable.js:743:3) 
    at _combinedTickCallback (internal/process/next_tick.js:80:11) 
    at process._tickDomainCallback (internal/process/next_tick.js:128:9) 

I'm unable to explain where that error is coming from since I only set the Content-Type header and the stack does not contain any of my code.

Why do my uploads occasionally fail?


There are 4 answers

josh3736 On BEST ANSWER

Node throws that TypeError when making an outgoing HTTP request if any string in the request's headers option object contains a character outside the basic ASCII range.
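
A rough way to spot the offending header before Node throws is to scan the headers object yourself. This is only a sketch: the helper name findNonAsciiHeaders is made up for illustration, it assumes the strict rule described above (anything outside printable ASCII is rejected), and the exact set of characters Node accepts may differ between versions.

```javascript
// Hypothetical helper: lists header names whose values contain a character
// outside printable ASCII (plus tab). This approximates the rule described
// above; it is not Node's actual validation code.
function findNonAsciiHeaders(headers) {
    return Object.keys(headers).filter(function (name) {
        return /[^\t\x20-\x7e]/.test(String(headers[name]));
    });
}

var suspects = findNonAsciiHeaders({
    'Content-Type': 'application/octet-stream',
    'Content-Disposition': 'form-data; name="file"; filename="r\u00e9sum\u00e9.pdf"'
});
console.log(suspects); // [ 'Content-Disposition' ]
```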

In this case, it appears that the Content-Disposition header is getting set on the request even though it is never specified in the request options. Since that header contains the uploaded filename, the request can fail if the filename contains non-ASCII characters. For example:

POST /upload HTTP/1.1
Host: public-server
Content-Type: multipart/form-data; boundary=--ex
Content-Length: [bytes]

----ex
Content-Disposition: form-data; name="file"; filename="totally legit .pdf"
Content-Type: application/pdf

[body bytes...]
----ex--

The request to other-service/process-file then fails because multiparty stores the part headers on the part object, which is also a readable stream representing the part body. When you pipe() the part into serviceRequest, the request module looks to see if the piped stream has a headers property, and if it does, copies them to the outgoing request headers.
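
The copying step can be pictured with a simplified model. This is not the request module's real source; copyPipedHeaders and fakePart are hypothetical names used only to illustrate the behavior described above, where headers found on the piped stream are added to the outgoing request unless a header of the same name is already set.

```javascript
// Simplified model of the behavior described above, NOT the request
// module's real source: when a stream is piped in, copy any headers it
// carries that the outgoing request has not already set.
function copyPipedHeaders(src, outgoingHeaders) {
    if (!src.headers) return outgoingHeaders;
    var existing = Object.keys(outgoingHeaders).map(function (h) {
        return h.toLowerCase();
    });
    Object.keys(src.headers).forEach(function (name) {
        if (existing.indexOf(name.toLowerCase()) === -1) {
            outgoingHeaders[name] = src.headers[name];
        }
    });
    return outgoingHeaders;
}

// A stand-in for a multiparty part: headers parsed from the multipart body.
var fakePart = {
    headers: {
        'content-disposition': 'form-data; name="file"; filename="r\u00e9sum\u00e9.pdf"',
        'content-type': 'application/pdf'
    }
};

var merged = copyPipedHeaders(fakePart, { 'Content-Type': 'application/octet-stream' });
console.log(Object.keys(merged)); // [ 'Content-Type', 'content-disposition' ]
```

Note that the explicitly set Content-Type survives, while Content-Disposition (and the filename inside it) comes along uninvited.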

The resulting outgoing request would look like:

POST /process-file HTTP/1.1
Host: other-service
Content-Type: application/octet-stream
Content-Disposition: form-data; name="file"; filename="totally legit .pdf"
Content-Length: [bytes]

[body bytes...]

...except that node sees the non-ASCII character in the Content-Disposition header and throws. The thrown error is caught by request and passed to the request callback function as err.

This behavior can be avoided by removing the headers from the part before piping it into the request.

delete part.headers;
part.pipe(serviceRequest);
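
If you would rather not mutate the part object, one alternative sketch is to route the bytes through a plain PassThrough stream, which has no headers property for the downstream request to pick up. The helper name withoutHeaders and the fakePart stand-in are hypothetical, and this assumes the receiving side only inspects the stream that is piped into it directly.

```javascript
var PassThrough = require('stream').PassThrough;

// Hypothetical helper: returns a plain stream carrying the part's bytes.
// The PassThrough has no `headers` property, so there is nothing for the
// downstream request to copy, and the original part is left unmodified.
function withoutHeaders(part) {
    return part.pipe(new PassThrough());
}

// Stand-in for a multiparty part, for demonstration only.
var fakePart = new PassThrough();
fakePart.headers = {
    'content-disposition': 'form-data; name="file"; filename="r\u00e9sum\u00e9.pdf"'
};

var body = withoutHeaders(fakePart);
console.log(body.headers); // undefined
```

In the upload handler this would be used as withoutHeaders(part).pipe(serviceRequest), leaving part.headers available if you still need the original filename.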
Nicolas Bodin On

You can use encodeURI server side and decodeURI client side.

Example with a csv file, using an Express server and a JavaScript client.

server

router.get('urlToGetYourFile', async (req, res, next) => {
  try {
    const filename = await functionToGetYourFilename();
    const file = await functionToGetYourFile();
    res
      .status(200)
      .header({
        'content-Type': 'text/csv',
        // Percent-encode the filename so the header value stays ASCII
        'content-disposition': 'attachment;filename=' + encodeURI(filename)
      })
      .send(file.toString('binary'));
  } catch(error) {
    return res.status(500).send({ error });
  }
});

client

const getFile = async () => {
  try {
    const response = await axios.get('urlToGetYourFile');
    const filename = decodeURI(response.headers['content-disposition'].split('filename=')[1]);
    const type = { type: 'text/csv' };
    const blob = new Blob([response.data], type);
    return new File([blob], filename, type);
  } catch(error) {
    throw error;
  }
}
Wliontb On

As @arrow's earlier comment suggested, apply encodeURI(filename) to the Content-Disposition header on the server. On the client, use decodeURI to decode it.
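
A minimal round trip of that idea, using two hypothetical helper names (buildDisposition and parseDisposition) and a made-up filename:

```javascript
// Hypothetical helpers mirroring the encodeURI/decodeURI suggestion.
function buildDisposition(filename) {
    // Server side: percent-encode so the header value is pure ASCII.
    return 'attachment;filename=' + encodeURI(filename);
}

function parseDisposition(headerValue) {
    // Client side: take everything after "filename=" and decode it.
    return decodeURI(headerValue.split('filename=')[1]);
}

var original = 'r\u00e9sum\u00e9 2017.csv';
var header = buildDisposition(original);
console.log(header); // attachment;filename=r%C3%A9sum%C3%A9%202017.csv
console.log(parseDisposition(header) === original); // true
```

One caveat: encodeURI leaves characters such as ; , and = untouched, so a filename containing them could confuse a stricter header parser. encodeURIComponent paired with decodeURIComponent escapes those as well.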

Aikon Mogwai On

This example shows how to send a file as an attachment when the filename contains non-ASCII characters.

const http = require('http');
const fs = require('fs');
const contentDisposition = require('content-disposition');
...

// req, res - http request and response
let filename = 'totally legit .pdf';
let filepath = 'D:/temp/' + filename;

res.writeHead(200, {
    'Content-Disposition': contentDisposition(filename), // Encode non-ASCII chars
    'Content-Transfer-Encoding': 'binary',
    'Content-Type': 'application/octet-stream'
});

var readStream = fs.createReadStream(filepath);
readStream.pipe(res);
readStream.on('error', (err) => ...);
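
To see roughly what such an encoded header looks like without the package, here is a hand-rolled sketch of the RFC 6266 / RFC 5987 form: an ASCII fallback filename plus a percent-encoded filename* parameter. This is not the content-disposition package's code and its exact output may differ; encodeExtValue and attachmentHeader are hypothetical names.

```javascript
// Percent-encode a string for use as an RFC 5987 ext-value.
// encodeURIComponent leaves ' ( ) * unescaped, so escape those too.
function encodeExtValue(str) {
    return encodeURIComponent(str).replace(/['()*]/g, function (c) {
        return '%' + c.charCodeAt(0).toString(16).toUpperCase();
    });
}

// Build an attachment header with an ASCII-only fallback for old clients
// and a UTF-8 filename* parameter for clients that understand it.
function attachmentHeader(filename) {
    // Replace non-ASCII characters, quotes and backslashes in the fallback.
    var fallback = filename.replace(/[^\x20-\x7e]|["\\]/g, '_');
    return 'attachment; filename="' + fallback + '"; filename*=UTF-8\'\'' +
        encodeExtValue(filename);
}

console.log(attachmentHeader('r\u00e9sum\u00e9.pdf'));
// attachment; filename="r_sum_.pdf"; filename*=UTF-8''r%C3%A9sum%C3%A9.pdf
```

The resulting value contains only ASCII, so it passes Node's header validation regardless of what the original filename was.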