I want to preserve a website offline as an object. I am using PowerShell 5.1.19041.546 on Windows 10.
#online analysis (does work)
$website = Invoke-WebRequest https://www.w3schools.com/html/html_tables.asp
$website | gm
#I get a Microsoft.PowerShell.Commands.HtmlWebResponseObject object
#next I pass $website to this function (I call it Get-WebRequestTable), which expects a [Microsoft.PowerShell.Commands.HtmlWebResponseObject] $WebRequest as input object: https://www.leeholmes.com/blog/2015/01/05/extracting-tables-from-powershells-invoke-webrequest/
#offline analysis: saving the website locally and importing it with Get-Content (does not work)
#saving the website locally
Invoke-WebRequest -Uri https://www.w3schools.com/html/html_tables.asp -OutFile C:\temp\website
#writing the website back to a variable
$offlinedata = Get-Content C:\temp\website
#I get a string object
$offlinedata | gm
#The string cannot be used in the function: Get-WebRequestTable : Cannot process argument transformation on parameter 'WebRequest'. Cannot convert the "System.Object[]" value of type "System.Object[]" to type "Microsoft.PowerShell.Commands.HtmlWebResponseObject".
Get-WebRequestTable -WebRequest $offlinedata
#offline analysis: saving the website locally as XML (does not work)
Invoke-WebRequest -Uri https://www.w3schools.com/html/html_tables.asp | Export-Clixml C:\temp\website.xml
This runs for a very long time and I get the following XML (shortened):
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04">
[...] <S>System.__ComObject</S>
<S>System.__ComObject</S>
It seems to create an endless loop at this point
<S>System.__ComObject</S>
#converting it to JSON to store it locally (does not work)
$website = Invoke-WebRequest -Uri https://www.w3schools.com/html/html_tables.asp
$website | ConvertTo-Json
I get:
ConvertTo-Json : An item with the same key has already been added.
Does anyone know a way to store a website locally and later restore the [Microsoft.PowerShell.Commands.HtmlWebResponseObject] object for further processing?
This code imports local HTML code into an HTML document object ([HTMLDocumentClass]) that can be processed like an "HtmlWebResponseObject".
Kudos to Prateek Singh https://ridicurious.com/2017/01/24/powershell-tip-parsing-html-from-a-local-file-or-a-string/
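A minimal sketch of that technique, assuming the page was saved with Invoke-WebRequest -OutFile as in the question. The function name Convert-LocalHtml and its -Path parameter are illustrative naming, not taken from the linked article. The HTMLFile COM object exists only on Windows, and which of the two write calls succeeds depends on the machine (as far as I know, IHTMLDocument2_write is typically only exposed where the mshtml interop assembly is installed), so treat the fallback as an assumption to verify locally.

```powershell
function Convert-LocalHtml {
    param(
        # Path to the HTML file saved earlier (hypothetical parameter name)
        [Parameter(Mandatory = $true)]
        [string] $Path
    )

    # -Raw returns the whole file as one string instead of one string per line
    $source = Get-Content -Path $Path -Raw

    # Create an empty HTML document through COM (Windows only)
    $html = New-Object -ComObject 'HTMLFile'

    try {
        # Preferred call; may not be exposed on every machine
        $html.IHTMLDocument2_write($source)
    }
    catch {
        # Fallback: write the markup as Unicode bytes
        $html.write([System.Text.Encoding]::Unicode.GetBytes($source))
    }

    # Return the parsed document object
    $html
}

# Usage with the file from the question:
# $offlinedata = Convert-LocalHtml -Path C:\temp\website
# $offlinedata.getElementsByTagName('TABLE')
```

The result is the parsed document itself, not a web response, so code written for Invoke-WebRequest output has to read from it directly instead of going through the ParsedHtml property.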
I changed Lee Holmes's code a bit so that it can handle both object types: [Microsoft.PowerShell.Commands.HtmlWebResponseObject] in case you use Invoke-WebRequest, or [HTMLDocumentClass] in case you use Convert-LocalHtml.
https://www.leeholmes.com/blog/2015/01/05/extracting-tables-from-powershells-invoke-webrequest/
Kudos to him for his great table extraction code
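A minimal sketch of what a version accepting both object types could look like. It follows the structure of the linked Lee Holmes function, but it is a reconstruction, not the exact modified code. The untyped $WebRequest parameter and the check for a ParsedHtml property are assumptions chosen here to tell the two inputs apart.

```powershell
function Get-WebRequestTable {
    param(
        # Either the result of Invoke-WebRequest or an already parsed HTML document
        [Parameter(Mandatory = $true)]
        [object] $WebRequest,

        # Zero-based index of the table on the page
        [Parameter(Mandatory = $true)]
        [int] $TableNumber
    )

    # Online objects carry the document in ParsedHtml; offline input is the document itself
    if ($WebRequest.PSObject.Properties['ParsedHtml']) {
        $document = $WebRequest.ParsedHtml
    }
    else {
        $document = $WebRequest
    }

    $tables = @($document.getElementsByTagName('TABLE'))
    $table  = $tables[$TableNumber]
    $titles = @()

    foreach ($row in @($table.Rows)) {
        $cells = @($row.Cells)

        # A header row supplies the property names
        if ($cells[0].tagName -eq 'TH') {
            $titles = @($cells | ForEach-Object { ('' + $_.InnerText).Trim() })
            continue
        }

        # No header seen yet: fall back to generated names P1, P2, ...
        if (-not $titles) {
            $titles = @(1..($cells.Count + 2) | ForEach-Object { "P$_" })
        }

        # Map each cell to its column title
        $result = [ordered] @{}
        for ($i = 0; $i -lt $cells.Count; $i++) {
            $title = $titles[$i]
            if (-not $title) { continue }
            $result[$title] = ('' + $cells[$i].InnerText).Trim()
        }

        [PSCustomObject] $result
    }
}

# Online:  Get-WebRequestTable -WebRequest $website -TableNumber 0
# Offline: Get-WebRequestTable -WebRequest $offlinedata -TableNumber 0
```

Checking for the property rather than the type keeps the parameter untyped, which avoids the argument transformation error from the question at the cost of weaker input validation.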