These days, I spend much of my time working with spreadsheets or manipulating data that has come from systems in the form of CSV files.
This week I ran into an interesting issue where I ended up using some PowerShell tricks to change the structure of the data into a more useful format.
I found myself with a CSV file with the columns: Sender
, Recipient
, Recipient2
, Recipient3
… all the way to Recipient24
. The file looked like this:
Every row had a value for Sender
and Recipient
, however only some rows had values for Recipient2
and onwards.
What I needed was a list of Sender
and Recipient
, removing the other Recipient*
columns and making them new entries. What I needed was a list that looked like this one:
I am not an Excel guru, but I am handy with PowerShell, so I decided that I would clean this up with PowerShell.
My original thought was this process:
- Import the CSV file using
import-csv
- Have a
foreach
loop that looked at each entry in the CSV file, and created new entries as follows.
My code looked like this:
$OriginalCSV = import-csv 'original.csv'
$NewCSV = @()
Foreach ($item in $OriginalCSV) {
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient}
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient2}
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient3}
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient4}
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient5}
# ...
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient23}
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient24}
}
$NewCSV | Export-Csv -Path 'new.csv'
This code does what I needed, but it is far from perfect. I have copy-pasted the same code; it does not look very re-usable.
How could I make this code more reusable?
With PowerShell, I can dynamically build a reference to an object’s methods and members as so:
$Value=2
$item."Recipient$Value"
By adding a for
loop, I can now pass throw each of the Recipient* columns without resorting to copy-pasting code. This allowed me to simplify my code down to:
$OriginalCSV = import-csv 'original.csv'
$NewCSV = @()
Foreach ($item in $OriginalCSV) {
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item.Recipient}
for ($ii = 2; $ii -le 24; $ii++) {
if ($item."Recipient$ii" -ne '') {
$NewCSV = $NewCSV + [PSCustomObject]@{Sender = $item.Sender; Recipient=$item."Recipient$ii"}
}
}
}
$NewCSV | Export-Csv -Path 'new.csv'
Till next time,
Kieran