How to filter files when only part of a file name is known?

We have Inventor files and all other kinds of files in EngDesignLibrary. Some of the .idw files also exist in .pdf format.
The .pdf file may or may not have exactly the same name as the .idw file. It will always contain the file name of the .idw, but it can have extra characters added to the left or the right of it. Those characters can be literally anything: alphanumerics, parentheses, hyphens, commas, special characters. There isn't any pattern.

The end result needs to contain the following kinds of files while keeping the directory structure:

  1. Cases where FileName.idw does not have exactly the same name as the .pdf, i.e. the .pdf name matches -like "*FileName*.pdf". I want these.
  2. All other kinds of files, excluding: (2.1) files whose extension is one of ('.ipt', '.iam', '.ipn', '.idw'); (2.2) cases where the .idw file name is exactly the same as the .pdf name, i.e. where FileName.idw matches FileName.pdf, the file should be excluded.

To get the above result I collect the files in two parts. Part 1 covers the .idw files whose name is not exactly the same as the .pdf name, and those where it is exactly the same. Part 2 is where I exclude the files that I don't need plus the files from Part 1.

The issue is that in Part 2 I am also getting the .pdf equivalents of the .idw files.

Script

$sourceFolder = "F:\Departments\EngDesignLib\"
$fileExtensions = @('.ipt', '.iam', '.ipn', '.idw')
$destination = "\\SharedFolder\Eng_3\All_Pdfs"
$csvFilePath = "C:\Users\Desktop\CSVOutput\MoveTo03\MoveTo03_1.csv"
Get-ChildItem -Path $sourceFolder -File -Recurse | ForEach-Object{
    $file = $_
    $extension = $_.Extension 
    $baseName = $file.Basename
    $filename = $file.Name
    #Part 1
    If ($_.Extension -in '.idw'){
    #change the extension of the file in the path itself
    $pdfFileName = "*$basename*.pdf"
    $pdfFileNameEaxctMatch = [System.IO.Path]::ChangeExtension($file.BaseName, "pdf")
    $pdfFilePath = $file.fullname -Replace ($file.Name, $pdfFileName)
    $pdfFilePathExactMatch = $file.fullname -Replace ($file.Name, $pdfFileNameEaxctMatch)
    $csvData = [PSCustomObject]@{
        FileName = $file.Name
        FilePath = $file.FullName
        IsInventorFile = 1
        FileNamePdfOFIDW = If (Test-Path $pdfFilePath){ $pdfFileName} Else{'PDF Not Found'}
        PdfFilePathOfPDFIDW =  If (Test-Path $pdfFilePath){ $pdfFilePath} Else{'PDF Not Found'}
    }
    $csvData| Export-Csv -Path $csvFilePath -NoTypeInformation -Force -Append
    }
    #Part 2
    If ( !( $_.Extension -in $fileExtensions ) -and ($fileName -notLike  ($pdfFileName -or  $pdfFileNameEaxctMatch))){
    $csvData = [PSCustomObject]@{
        FileName = $file.Name
        FilePath = $file.FullName
        IsInventorFile = 0
        FileNamePdf = 'Not an Iventor File'
        PdfFilePath = 'Not an Iventor File'
    }
    $csvData| Export-Csv -Path $csvFilePath -NoTypeInformation -Force -Append
    } 
}

I have tried most of the comparison operators (-like, -notlike, -contains, -notcontains) but none of them work. I am also not entirely sure how the line $pdfFileName = "*$basename*.pdf" works, because the output shows the file name with asterisks, yet Test-Path returns true even though no file has asterisks in its name. It also returns files where the .idw file name is exactly the same as the .pdf name, which messes up the files I get in Part 1.

Output example for this particular case:

F:\Departments\Engineering\EngDesignLib\089\089705.000\089705-01.idw
F:\Departments\Engineering\EngDesignLib\089\089705.000\*089705-01*.pdf

How do I get the files where the .idw file name is not exactly the same as the .pdf name, and how do I make the filtering-out condition in Part 2 work?

1 answer

Answered by jeremywat

Considering your 2 criteria:

  1. The extension is .idw and no file with exactly the same name and the extension .pdf exists.
  2. The extension is not of the following:
    • .ipt
    • .iam
    • .ipn

Try this:

Get-ChildItem -Path $sourceFolder -File -Recurse | Where-Object {
    $_.Extension -notin '.ipt', '.iam', '.ipn' -and
    -not (
        $_.Extension -eq '.idw' -and 
        (Test-Path -LiteralPath ($_.FullName -replace '\.idw$', '.pdf'))
    )
}

Explanation:

$_.Extension -notin '.ipt', '.iam', '.ipn'

Tests that the extension is not one of the listed extensions.

-not (
    $_.Extension -eq '.idw' -and 
    (Test-Path -LiteralPath ($_.FullName -replace '\.idw$', '.pdf'))
)

I'll break this down into the 4 things it's doing:

$_.Extension -eq '.idw'

Tests if the extension equals .idw.

$_.FullName -replace '\.idw$', '.pdf'

Gets the full path of the file, replacing the file extension .idw with .pdf.
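As a quick illustration of that replacement, here is a minimal sketch run against a few made-up paths (the paths are hypothetical, not taken from the question). It relies on -replace treating its first argument as a regular expression and matching case-insensitively by default.

```powershell
# Hypothetical sample paths, used only for illustration.
$samples = 'C:\lib\089705-01.idw', 'C:\lib\notes.idw.txt', 'C:\lib\PART.IDW'

# '\.' matches a literal dot and '$' anchors the match at the end of the string,
# so only a trailing ".idw" is swapped for ".pdf".
$results = $samples | ForEach-Object { $_ -replace '\.idw$', '.pdf' }

$results
```

The second path is left unchanged because ".idw" is not at the end of the string, while the third one is rewritten even though its extension is upper case.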

Test-Path -LiteralPath ($_.FullName -replace '\.idw$', '.pdf')

Tests whether or not the replaced file path exists (i.e. whether a file with the same name but ending in .pdf exists).
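The choice of -LiteralPath matters here, and it probably also explains why Test-Path returned true for the "*$basename*.pdf" path in the question: with the default -Path parameter, wildcards are expanded and any match counts. A minimal sketch, assuming a throwaway temp folder and a made-up file name:

```powershell
# Hypothetical scratch folder and file name, used only for illustration.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
New-Item -ItemType File -Path (Join-Path $dir 'A-089705-01 (rev2).pdf') | Out-Null

# Same shape as the question's "*$basename*.pdf" path.
$pattern = Join-Path $dir '*089705-01*.pdf'

# -Path expands wildcards: true as soon as any file matches the pattern.
$wildcardHit = Test-Path -Path $pattern

# -LiteralPath takes '*' literally: there is no file with that exact name.
$literalHit = Test-Path -LiteralPath $pattern

# Clean up the scratch folder.
Remove-Item -LiteralPath $dir -Recurse -Force
```

Here $wildcardHit should come out as $true and $literalHit as $false, which is why a path containing asterisks can "exist" according to Test-Path even though no file name contains an asterisk.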

-not ( ... )

Inverts the $true/$false result of the tests inside the parentheses. So, if the file does have the extension .idw and a file with the same name ending in .pdf exists, the inner tests return $true, and the -not flips that to $false. That makes the whole Where-Object filter $false, so this file is not returned.
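To see the filter as a whole in action, here is a small sketch that runs it against a throwaway folder. The folder and the file names are made up for the example; only the Where-Object block is the one from above.

```powershell
# Hypothetical scratch folder and file names, used only for illustration.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
'a.idw', 'a.pdf', 'b.idw', 'X-b-rev1.pdf', 'c.ipt', 'd.txt' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# Same filter as above, collecting only the names of the files that pass.
$kept = Get-ChildItem -Path $dir -File -Recurse | Where-Object {
    $_.Extension -notin '.ipt', '.iam', '.ipn' -and
    -not (
        $_.Extension -eq '.idw' -and
        (Test-Path -LiteralPath ($_.FullName -replace '\.idw$', '.pdf'))
    )
} | Select-Object -ExpandProperty Name

# Clean up the scratch folder.
Remove-Item -LiteralPath $dir -Recurse -Force

$kept
```

In this sketch a.idw is dropped because a.pdf exists, and c.ipt is dropped because of its extension. b.idw is kept, since only the differently named X-b-rev1.pdf exists for it. Note that a.pdf itself still passes the filter, so if exact-name PDFs also need to be excluded, that would take an additional condition.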