Create Grouped Averages from CSV data using Powershell

37 views Asked by At

I have a CSV file similar to this:

Name,Dates,Sortino,RoMDD,PctWins
AAL,2017-06-01 - 2018-06-01,-0.47,-0.26,66.67%
AAL,2017-09-01 - 2018-06-01,0.03,0.02,66.67%
AAL,2017-12-01 - 2018-06-01,-1.05,-0.69,50.00%
AAL,2018-03-01 - 2018-06-01,-2.99,-2.32,33.33%
AAPL,2017-06-01 - 2018-06-01,2.01,1.79,66.67%
AAPL,2017-09-01 - 2018-06-01,1.89,1.68,66.67%
AAPL,2017-12-01 - 2018-06-01,1.66,1.64,66.67%
AAPL,2018-03-01 - 2018-06-01,3.1,3.86,66.67%
ADBE,2017-06-01 - 2018-06-01,3.88,6.09,75.00%
ADBE,2017-09-01 - 2018-06-01,4.16,7.03,77.78%
ADBE,2017-12-01 - 2018-06-01,3.92,8.34,83.33%
ADBE,2018-03-01 - 2018-06-01,4.58,10.29,100.00%

The Dates column is irrelevant and just added for context.

I need to create averages grouped by name, equivalent to an Excel pivot table with Rows: Name and Values: Average of [RoMDD,PctWins,Sortino].

Those results then need to be filtered by Top 20 Sortino, then Top 20 RoMDD, and Top 20 PctWins, all within the previous level (each filter builds on the filtering of the previous level). But that is perhaps a subject for a 2nd post. Otherwise I'll Google (add rank to a list of numbers?)

Here is what I've written so far:

$csv = import-csv -path C:\Temp\Example.CSV | Select-Object -Property Name,Sortino,RoMDD,PctWins | ForEach {$_.PctWins=$_.PctWins.Replace("%","");$_}
$collection = $csv | group-object -property Name
$avg = $collection | Foreach {
    $avg = $_.group|measure -Property Sortino -Average | select -expand average
    add-member -inputobject $_.group[0] -notepropertyname AvgSortino -notepropertyvalue $avg -PassThru
    $avg = $_.group|measure -Property RoMDD -Average | select -expand average
    add-member -inputobject $_.group[0] -notepropertyname AvgRoMDD -notepropertyvalue $avg -PassThru
    $avg = $_.group|measure -Property PctWins -Average | select -expand average
    add-member -inputobject $_.group[0] -notepropertyname AvgPctWins -notepropertyvalue $avg -PassThru
} | Select Name,AvgSortino,AvgRoMDD,AvgPctWins

This is close, but adds each average to a new object, instead of additional columns in the one object.

Would creating a new PSCustomObject be a better approach here? Or using Datatable and Linq? I don't know Linq but I know SQL quite well.

The final output will be written back to a new CSV file including new (Avg) column headers.

2

There are 2 answers

0
Olaf On

I'm not completely sure if I got what you need ... but:

$csv | 
    Group-Object -Property Name -PipelineVariable Name |
        ForEach-Object {
            $_.group | Measure-Object -Property Sortino,RoMDD,PctWins -Average 
        } |
            Select-Object -Property @{Name = 'Name'; Expression = {$Name.Name}},Property, Average

This returns the output:

Name Property Average
---- -------- -------
AAL  Sortino    -1,12
AAL  RoMDD    -0,8125
AAL  PctWins  54,1675
AAPL Sortino    2,165
AAPL RoMDD     2,2425
AAPL PctWins    66,67
ADBE Sortino    4,135
ADBE RoMDD     7,9375
ADBE PctWins  84,0275
0
sirtao On

Making a new object is the best way

EDIT: improved solution, thanks @Olaf

$CollectionWithAverages = Import-Csv -Path $CsvPath | 
    Group-Object -Property Name | 
    Select-Object -Property Name,
        @{Name = 'AvgSortino'; Expression = { $_.Group | Measure-Object -Property Sortino -Average | Select-Object -ExpandProperty Average } },
        @{Name = 'AvgRoMDD'; Expression = { $_.Group | Measure-Object -Property RoMDD -Average | Select-Object -ExpandProperty Average } },
        @{Name = 'AvgPctWins'; Expression = { $_.Group | Measure-Object -Property { $_.PctWins.Trimend('%') } -Average | Select-Object -ExpandProperty Average } }


$CollectionWithAverages

returns

Name AvgSortino AvgRoMDD AvgPctWins
---- ---------- -------- ----------
AAL       -1,12    -0,81      54,17
AAPL       2,17     2,24      66,67
ADBE       4,13     7,94      84,03

OLD SOLUTION(still a bit more readable tho'):

# I also put the replace directly in the Select-Object as calculated property, should be more efficient with large CSVs
$csv = Import-Csv -Path $CsvPath | Select-Object -Property Name, Sortino, RoMDD, @{Name = 'PctWins'; Expression = { $_.PctWins.Replace('%', '') } }

$CollectionGrouped = $csv | Group-Object -Property Name

$CollectionWithAverages = $CollectionGrouped| ForEach-Object {
    [PSCustomObject]@{
        Name       = $_.Name
        AvgSortino = $_.Group | Measure-Object -Property Sortino -Average | Select-Object -ExpandProperty Average
        AvgRoMDD   = $_.Group | Measure-Object -Property RoMDD -Average | Select-Object -ExpandProperty Average
        AvgPctWins = $_.Group | Measure-Object -Property PctWins -Average | Select-Object -ExpandProperty Average
    }
}

$CollectionWithAverages

returns

Name AvgSortino AvgRoMDD AvgPctWins
---- ---------- -------- ----------
AAL       -1,12    -0,81      54,17
AAPL       2,17     2,24      66,67
ADBE       4,13     7,94      84,03

You can actually one-line this, but I thought to keep it more readable.