Accessing the Internet from PowerShell: Get-WebPage Part 1

Last time we looked at accessing the web from PowerShell, we looked at a simple function which returned our external IP address. We then used that function to make a script that automatically alerts us when our external IP address changes. This was a pretty simple function and script, but I want to spend some time discussing it further and build a full CMDLet.

In later posts I want to discuss how I believe you should design your PowerShell code, including CMDLets, functions, scripts and modules, but I want to touch on some of this today.

In the get-externalip function, we simply created the .NET Framework WebClient object, set up some headers, and then called the DownloadString method. This is fine for a simple script, but it isn't great design in the long term. What we need is a CMDLet that downloads a URL from the internet (or maybe a local source), which our scripts can then make use of. The CMDLet should give us a powerful framework where we can easily leverage all of the WebClient class functionality, but in a refined and controlled PowerShell manner.

We should also ensure that we provide help and documentation with our CMDLets, so anyone can use them. For this we will use the PowerShell comment-based help syntax, an overlooked part of PowerShell. Seriously, Microsoft should be encouraging more use of this. Every language has something similar, so it’s not special to PowerShell, but if every PowerShell developer used the comment-based help syntax, the world would be a much better place.
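To make the idea concrete, here is a minimal sketch of comment-based help on a tiny function. Get-Greeting is a hypothetical example invented for illustration, not part of the CMDLet discussed in this post.

```powershell
# Hypothetical function, used only to show where the help keywords go.
function Get-Greeting {
    <#
    .SYNOPSIS
    Returns a greeting for the given name.

    .DESCRIPTION
    Builds a short greeting string. It exists only to illustrate comment-based help.

    .PARAMETER Name
    The name to greet.

    .EXAMPLE
    Get-Greeting -Name 'World'
    #>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Name
    )

    # Emit the greeting to the pipeline.
    "Hello, $Name"
}
```

Once the function is loaded, running Get-Help Get-Greeting -Full should show the synopsis, description, parameter text and example, just as it would for a built-in command.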

Let’s take a look at a CMDLet that will get the HTML representation of a page. This CMDLet will also allow us to set things like the proxy server, headers, credentials for the remote page and the user agent!

The CMDLet below is based upon my Infrastructure Saturday 2012 presentation on PowerShell.

Whoa, quite a bit here, around 100 lines of code! Most of it, though, is documentation. As you can see, we have the start of the function definition, then the comment-based help. This provides the documentation that the Get-Help CMDLet displays, including descriptions, parameters, inputs, outputs and examples.

Next we have the CmdletBinding attribute. We are not specifying anything extra here, just telling PowerShell to treat this function as a CMDLet.

After this we have the parameters. They all look pretty simple, and only one is mandatory: URL. I have also specified types for all of the parameters. Whilst the specification of types is optional, I do this wherever possible, as it reduces errors and provides very useful input validation.
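As a rough illustration of the CmdletBinding attribute and typed parameters, the sketch below declares a parameter block along those lines. The function name Get-BoundWebParameter and every parameter name other than Url are assumptions made for this example, not taken from the CMDLet itself.

```powershell
# Sketch only: the function name and the optional parameter names are assumptions.
function Get-BoundWebParameter {
    [CmdletBinding()]
    param(
        # The only mandatory parameter.
        [Parameter(Mandatory = $true)]
        [string]$Url,

        [string]$UserAgent,

        [System.Net.IWebProxy]$Proxy,

        [System.Management.Automation.PSCredential]$Credential,

        [hashtable]$Headers
    )

    # Return what was bound, so the effect of the declarations is easy to inspect.
    $PSBoundParameters
}
```

Because each parameter has a type, a call such as Get-BoundWebParameter -Url 'http://example.com' -Headers 'oops' is rejected during parameter binding, before any of the function body runs.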

Now we have the body of the CMDLet. As expected, we are making a new instance of the WebClient class. Then we have some if statements: for each optional parameter that has been specified, we set the corresponding part of the WebClient object.

I hard-code the encoding; it keeps things simple and easy.
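The "only set it if it was supplied" pattern, together with a fixed encoding, might look something like the sketch below. The function name New-ConfiguredWebClient, the parameter names and the choice of UTF-8 are all assumptions for illustration; the post does not say which encoding is used.

```powershell
# Sketch only: names and the UTF-8 choice are assumptions, not the original code.
function New-ConfiguredWebClient {
    [CmdletBinding()]
    param(
        [string]$UserAgent,
        [System.Net.IWebProxy]$Proxy,
        [System.Management.Automation.PSCredential]$Credential,
        [hashtable]$Headers
    )

    $client = New-Object System.Net.WebClient

    # Only touch a setting when the caller supplied the matching parameter.
    if ($PSBoundParameters.ContainsKey('Headers')) {
        foreach ($name in $Headers.Keys) {
            $client.Headers.Add($name, [string]$Headers[$name])
        }
    }
    if ($PSBoundParameters.ContainsKey('UserAgent')) {
        $client.Headers.Add('User-Agent', $UserAgent)
    }
    if ($PSBoundParameters.ContainsKey('Proxy')) {
        $client.Proxy = $Proxy
    }
    if ($PSBoundParameters.ContainsKey('Credential')) {
        $client.Credentials = $Credential.GetNetworkCredential()
    }

    # Fixed encoding, regardless of the parameters above.
    $client.Encoding = [System.Text.Encoding]::UTF8

    # Hand the configured client back to the caller.
    $client
}
```

Checking $PSBoundParameters rather than testing each variable for a value keeps "not supplied" distinct from "supplied but empty", which is one reasonable way to express the if statements described above.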

We then create a variable to hold the result and, within a try {} catch {} block, call DownloadString. If we do catch any errors, we simply rethrow them to the calling code. We don’t particularly care about handling errors here, as the calling code should; however, it’s crucial that errors are handled in a controlled manner.
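The download-and-rethrow step can be sketched as follows. Get-PageText is a hypothetical, stripped-down stand-in and not the CMDLet from this post; the file:// URL in the usage lines is a deliberately missing local path, chosen so the failure can be seen without touching the network.

```powershell
# Sketch only: Get-PageText is a hypothetical name, not the CMDLet from this post.
function Get-PageText {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url
    )

    $client = New-Object System.Net.WebClient
    $result = $null

    try {
        $result = $client.DownloadString($Url)
    }
    catch {
        # No recovery here: rethrow so the calling code decides what to do.
        throw
    }

    # Return the downloaded text.
    $result
}

# The calling code handles the failure in one controlled place.
try {
    $page = Get-PageText -Url 'file:///no/such/page.html'
}
catch {
    Write-Warning "Download failed: $($_.Exception.Message)"
}
```

A bare throw inside the catch block rethrows the original error record, so the caller still sees the underlying exception rather than a new, less informative one.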

Finally, we return the resulting page.

Next time we will look at some examples and some further discussion of the code.