Select-String -Context and find email address in data


I have a text file in which variations of the data shown below (the number after 'SVC' and the date before, along with the text body) appear multiple times. I can capture each block of data, but once I do, I need to locate an email address inside it. The email may appear on any of context lines 4 through 9. I can't figure out how to isolate that text and store it in a variable so the address can be captured.

Select-String $WLDir -pattern '(\d{2}:\d{2}) - (\d{2}:\d{2})(PMT[S|T]\d{8})' -Context 0,9 | ForEach-Object {
        $StartTime=[datetime]::ParseExact($_.Matches.Groups[1].Value,"HH:mm",$null)
        $EndTime=[datetime]::ParseExact($_.Matches.Groups[2].Value,"HH:mm",$null)
        $ElapsedTime = (NEW-TIMESPAN –Start $StartTime –End $EndTime).TotalHours
        $Email = Select-String $_. -pattern '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)'
    [PSCustomObject]@{
        SO = $_.Matches.Groups[3].Value
        Topic = $_.Context.PostContext[0]
        Status = $_.Context.PostContext[1]
        ElapsedHrs = $ElapsedTime
        Email = $Email
    }
} | Export-Csv $ExportCsv -NoTypeInformation

My example file is like this:

  09:45 - 10:15SVC1234567 | Sev8 |437257 | COMPANY | Due: 12/28/2016
  WORK TITLE
  - - Preferred Customer (Y/N): Y Phone: 000-000-0000 ANY Hardware (Y/N): N 
  DATA on file (Y/N/NA): Y Contact: Person Name Full Address: 1234 PANTS 
  XING, RM/STE 100,NEWARK, NJ, 00000 - Hours: 8-5 Issue: Install admin 
  and others Fax Number: NA (required for all cases sent to LOCATION or 
  LOCATION_EXCPT Provider Groups) E-Mail address: [email protected] the 
  customer speak English? yes Escalation Approved By (Name/ID): Guy 
  aljdfs ITEM Product: PRODUCTNAME Group:THIS ONE Include 
  detailed notes below, including reason for severity: SCHEDULED WORK 
  ------------------------------ NOTES: -Cx requesting a tech on site -Cx 
  wants to install WS and wants to be assisted in other concerns

I've tried capturing the email from the context with $Email = Select-String $_. -pattern '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)', $Email = Select-String $_.WLDir -pattern '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)' and $Email = Select-String $_.Context -pattern '(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)', but I can't figure out how to refer back to the context so that I can search it for the email address. There's also a good chance I'm going about this the wrong way. Does anyone know how I can capture the address and store it in a variable?


There are 2 answers

0
Nate (BEST ANSWER)

Because I never found a reliable way to capture the email in PowerShell, I decided to capture all of the post-context lines (0-9) into Status instead. In the Excel sheet, I use the formula from this page, =IF(O6="","",TRIM(RIGHT(SUBSTITUTE(LEFT(O6,FIND(" ",O6&" ",FIND("@",O6))-1)," ",REPT(" ",LEN(O6))),LEN(O6)))), to pull the email out of column "O" into column "Q", where it belongs. I appreciate everyone's assistance.
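For anyone who would rather stay in PowerShell: the Excel formula above isolates the space-delimited word that contains the first "@". A rough equivalent might look like the sketch below. The function name Get-EmailToken and the sample text are hypothetical, and returning an empty string when there is no "@" is a choice made for this sketch, not something the formula does.

```powershell
# Hypothetical helper: return the first whitespace-delimited token that contains "@".
# Returns an empty string when no such token exists (an assumption of this sketch).
function Get-EmailToken {
    param([string]$Text)
    $token = $Text -split '\s+' | Where-Object { $_ -like '*@*' } | Select-Object -First 1
    if ($null -eq $token) { return '' }
    return $token
}

# Made-up sample text, not taken from the real work log
$sample = 'Fax Number: NA E-Mail address: user@example.com Hours: 8-5'
Get-EmailToken -Text $sample
```

Like the formula, this only splits on whitespace, so an address with trailing punctuation or text glued to it would come back with that extra text attached.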

7
skrubber

Try this:

# Read the whole file as a single string so one match can span several lines
$content = Get-Content -Path $path -Raw
# Simple email pattern; note the escaped dot before the top-level domain
$regex1 = [regex]'[\w.%+-]+@[\w.-]+\.\w+'
# One match per record: start time, end time, code prefix, code digits, then the
# body up to the next "HH:mm - HH:mm" line or the end of the file
$regex2 = [regex]'(?ms)(\d{2}:\d{2}) - (\d{2}:\d{2})(\D+)(\d+)(.*?)(?=^\s*\d{2}:\d{2} - \d{2}:\d{2}|\z)'
$regex2.Matches($content) | ForEach-Object {
    $startTime = [datetime]::ParseExact($_.Groups[1].Value, "HH:mm", $null)
    $endTime = [datetime]::ParseExact($_.Groups[2].Value, "HH:mm", $null)
    $elapsedTime = (New-TimeSpan -Start $startTime -End $endTime).TotalHours
    # Use the captured prefix rather than a hard-coded one
    $code = $_.Groups[3].Value + $_.Groups[4].Value
    $remainingString = $_.Groups[5].Value
    $lines = $remainingString -split "\r?\n"
    $topic = $lines[1]
    $status = $lines[2]
    # First address found in this record, or an empty string if there is none
    $email = $regex1.Match($remainingString).Value

    [PSCustomObject]@{
        SO = $code
        Topic = $topic
        Status = $status
        ElapsedHrs = $elapsedTime
        Email = $email
    }
} | Export-Csv "res.csv" -NoTypeInformation
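If you would rather keep the Select-String -Context approach from the question, one option is to join $_.Context.PostContext (an array of the lines following the match) into a single string and run the email regex against that. In the question's attempts, the unnamed argument was most likely being bound to -Path, which would explain why nothing came back. The following is only a sketch: the sample record, the temporary file, and the variable names are made up for illustration, the character class is written as [ST] on the assumption that the "|" in [S|T] was unintended, and the email pattern is case-insensitive with a relaxed top-level-domain length.

```powershell
# Made-up sample record written to a temporary file so the sketch is self-contained
$sampleFile = [System.IO.Path]::GetTempFileName()
@'
09:45 - 10:15PMTS12345678 | Sev8 | COMPANY
WORK TITLE
Contact: Person Name Hours: 8-5
E-Mail address: user@example.com Notes: none
'@ | Set-Content -Path $sampleFile

# Case-insensitive email pattern (an assumption; adjust it to the addresses you expect)
$emailRegex = [regex]'(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b'

$results = Select-String -Path $sampleFile -Pattern '(\d{2}:\d{2}) - (\d{2}:\d{2})(PMT[ST]\d{8})' -Context 0,9 |
    ForEach-Object {
        # PostContext holds the lines after the match; join them into one searchable string
        $context = $_.Context.PostContext -join ' '
        [PSCustomObject]@{
            SO    = $_.Matches[0].Groups[3].Value
            Topic = $_.Context.PostContext[0]
            Email = $emailRegex.Match($context).Value   # empty string when no address is found
        }
    }

Remove-Item -Path $sampleFile
$results
```

Joining with a space keeps the lines separate, so an address that was wrapped across two lines in the file would not be reassembled by this sketch.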