Issue with Chinese Text Display in PDF Upload


I am running into a problem when reading an uploaded PDF that contains Chinese text. When I extract the PDF content, the Chinese text is missing, and in its place I get only newline characters (\n). I am using pdf-lib and pdf-parse.

import fs from "fs";
import pdfParse from "pdf-parse";

// Function to parse a PDF file
async function parsePDF(filePath) {
  try {
    // Read the PDF file
    const dataBuffer = await fs.promises.readFile(filePath);

    // Convert the buffer to text using pdf-parse
    const data = await pdfParse(dataBuffer);

    // Access the parsed text, which may contain Chinese characters
    const chineseText = data.text;

    // Now you can work with the Chinese text as needed
    console.log(chineseText);
  } catch (error) {
    console.error("Error parsing PDF:", error);
  }
}

// Example usage
const filePath = "path/to/your/pdf/file.pdf";
parsePDF(filePath);

This is what I get when I read the text from the file:

'\n\nFBA\n\nFBA: *****\n\n\n\n\n\n\n\n\n\n\nFBA17PG60THDU000001\nSingle SKU\n\n\n\n\n\nFBA\n\nFBA: EVEO LLC\n\n\n\n\n\n\n\n\n\n\nFBA17PG60THDU000002\nSingle SKU\n\n\n\n\n\nFBA\n\n
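To check whether the Chinese characters are being dropped entirely rather than mis-encoded, the extracted string can be inspected for Han characters. The following is only a diagnostic sketch: `summarizeExtractedText` is a hypothetical helper name, not part of pdf-parse or pdf-lib, and the sample string is a shortened stand-in in the same shape as the output above.

```javascript
// Hypothetical diagnostic helper (not part of pdf-parse or pdf-lib):
// counts Han characters, newlines, and non-whitespace characters in a string.
function summarizeExtractedText(text) {
  // \p{Script=Han} matches CJK ideographs; the "u" flag enables Unicode property escapes.
  const hanChars = text.match(/\p{Script=Han}/gu) ?? [];
  const newlines = text.match(/\n/g) ?? [];
  // Spread into an array so characters outside the BMP count as one each.
  const visibleChars = [...text.replace(/\s/g, "")];
  return {
    hanCount: hanChars.length,
    newlineCount: newlines.length,
    visibleCount: visibleChars.length,
  };
}

// Shortened stand-in for the extracted text (assumption: same shape as data.text).
const sample = "\n\nFBA\n\nFBA: *****\n\n\n\nFBA17PG60THDU000001\nSingle SKU\n";
console.log(summarizeExtractedText(sample));
```

If `hanCount` is 0 for `data.text` while the PDF visibly contains Chinese, the characters are not reaching the output at all, which points at the text extraction step rather than at how the string is printed.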


Does anyone know how to read the Chinese text correctly?
