I am retrieving the list of all .pdf files in a directory and I have a function to get the number of pages for one pdf.

//List of all PDF files
string[] filePaths = Directory.GetFiles(cboSource.Text, "*.pdf", SearchOption.AllDirectories);
MessageBox.Show(String.Join(Environment.NewLine, filePaths));

//Get the number of pages in a PDF file
public int GetNumberOfPdfPages(string fileName)
{
    using (StreamReader sr = new StreamReader(File.OpenRead(fileName)))
    {
        Regex regex = new Regex(@"/Type\s*/Page[^s]");
        MatchCollection matches = regex.Matches(sr.ReadToEnd());
        return matches.Count;
    }
}

Please ignore the MessageBox as I have just used it to see whether the values are correct.

Now I want to get the name/path of the one PDF that has the fewest pages among all the files in string[] filePaths.

Please help.

Regards

4 Answers

1
Mile On Best Solutions

You can get the number of pages like this (PdfReader is the iTextSharp class):

PdfReader pdfReader = new PdfReader("<path>");
int numberOfPages = pdfReader.NumberOfPages;

Add the number of pages for every PDF to an array, and then get the smallest count with

array.Min();

or:

Dictionary<PdfReader , int> pdfs= new Dictionary<PdfReader , int>();

and then get the PDF with the fewest pages (MinBy is built into LINQ from .NET 6 onward; older projects need a library such as MoreLINQ):

pdfs.MinBy(x=> x.Value).Key;
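
If MinBy is not available, the same idea can be sketched as a plain loop over the file paths. This is only an illustration under some assumptions: PdfPicker and FindFewestPages are hypothetical names, and the page counter is passed in as a delegate so the selection logic does not depend on any particular PDF library. With iTextSharp you would pass something like path => GetNumberOfPdfPages(path).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class PdfPicker
{
    // Returns the path with the smallest page count, or null when paths is empty.
    // countPages is supplied by the caller (for example a wrapper around PdfReader.NumberOfPages).
    public static string FindFewestPages(IEnumerable<string> paths, Func<string, int> countPages)
    {
        string best = null;
        int bestCount = 0;
        foreach (string path in paths)
        {
            int count = countPages(path);
            // Take the first file unconditionally, then only strictly smaller counts,
            // so the earliest file wins when several share the minimum.
            if (best == null || count < bestCount)
            {
                bestCount = count;
                best = path;
            }
        }
        return best;
    }
}
```

Usage would look like string shortest = PdfPicker.FindFewestPages(filePaths, GetNumberOfPdfPages); each file is counted exactly once, and no PdfReader instances are kept around as dictionary keys.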
0
neeraj baheti On
static void Main(string[] args)
{
        string[] filePaths = Directory.GetFiles("{Directory_Path}", "*.pdf", SearchOption.AllDirectories);
        int noOfPages = 0;
        string filePath = "";
        for(int i= 0;i < filePaths.Length;i++)
        {
            int tmp = GetNumberOfPdfPages(filePaths[i]);
            if(i==0)
            {
                noOfPages = tmp;
                filePath = filePaths[i];
            }
            else
            {
                if(tmp < noOfPages) // keep the file with the fewest pages seen so far
                {
                    noOfPages = tmp;
                    filePath = filePaths[i];
                }
            }
        }
}
public static int GetNumberOfPdfPages(string fileName)
{
        PdfReader pdfReader = new PdfReader(fileName); // use iTextSharp for PdfReader class.
        int numberOfPages = pdfReader.NumberOfPages;
        pdfReader.Close(); // release the file once the count has been read
        return numberOfPages;
}
0
na th On

Your function GetNumberOfPdfPages does not work (a regex over the raw file text misses page objects that are stored in compressed streams), so find another way to count the pages. Assuming it does work, you can do the following:

        //List of all PDF files
        string[] filePaths = Directory.GetFiles(cboSource.Text, "*.pdf", SearchOption.AllDirectories);
        MessageBox.Show(String.Join(Environment.NewLine, filePaths));

        string finalFile = string.Empty;
        int pages = int.MaxValue;
        foreach(var file in filePaths)
        {
            int currentPages = GetNumberOfPdfPages(file);
            if(currentPages < pages )
            {
                pages = currentPages; // remember the smallest count seen so far
                finalFile = file;
            }
        }
0
Muhammad Aqib On

You can collect the file name and page count using a model like this:

public class PdfFileInfo
{
    public string Filename { get; set; }
    public int PageCount { get; set; }
}


private void GetPdfFiles(string folder)
{
    var pdfFileInfos = new List<PdfFileInfo>();

    var filePaths = Directory.GetFiles(folder, "*.pdf", SearchOption.AllDirectories);

    foreach (var filePath in filePaths)
    {
        pdfFileInfos.Add(new PdfFileInfo
        {
            Filename = filePath,
            PageCount = GetNumberOfPdfPages(filePath)
        });
    }

    pdfFileInfos = pdfFileInfos.OrderBy(x => x.PageCount).ToList();

    if (pdfFileInfos.Count > 0) // a single PDF is still a valid result
    {
        var result = pdfFileInfos[0];

        MessageBox.Show($"{result.Filename} has {result.PageCount} pages.");
    }
}