Accessing the Internet from PowerShell: Download Files Part 1


Today we are going to download files from the internet, and a few other places, as a continuation of the series on the WebClient class in the .NET Framework. The WebClient class has a method called DownloadFile which, as the name says, allows us to download files from many sources, including HTTP, HTTPS, FTP and \\server\share locations. The aim is to have, by the end of this, a cmdlet that lets us download files from these various locations.

So, how does the DownloadFile method work? Here is an example:
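A minimal sketch of a call. To keep it runnable without a network connection, it assumes a small local file as the source (the file names are made up for illustration); an http://, https:// or ftp:// URL would go in the same place.

```powershell
# Create a small local file to act as the download source (illustrative only).
$tempDir = [System.IO.Path]::GetTempPath()
$source  = Join-Path $tempDir 'downloadfile-source.txt'
$dest    = Join-Path $tempDir 'downloadfile-copy.txt'
Set-Content -Path $source -Value 'hello from DownloadFile'

$client = New-Object System.Net.WebClient
try {
    # DownloadFile(address, fileName): address is where the file comes from,
    # fileName is the full path of the file to write (a file, not a folder).
    $client.DownloadFile([System.Uri]$source, $dest)
}
finally {
    # Release the client even if the download fails.
    $client.Dispose()
}
```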

The only issue with DownloadFile is that we need to specify a filename as the destination. This is something we will need to work around/with.

So if we wanted to design a cmdlet around this, what should we consider? Firstly, let’s just ignore headers and encoding types and all that junk and assume we will handle that. What else is there? Well, for a cmdlet that downloads files, we probably want to be able to use it in the pipeline; that is, I might want to pipe an array or list of URLs to that cmdlet and have it download them all. Next, I don't want to have to specify the destination file name and folder all the time, but that doesn't mean I might not want to specify one or the other down the track. Finally, I hate downloading a file twice, or accidentally overwriting a destination file. I want the cmdlet to not overwrite files unless I tell it to, and if it does overwrite one, to warn me about it.

So here are the requirements:

  • We want to be able to pipe URLs to the cmdlet so they can be downloaded
  • Destination filename optional. The cmdlet determines the filename if none is specified
  • Destination folder/directory optional. We might want to send a file to a different directory than the current working directory
  • By default, do not overwrite files. If we tell the cmdlet to overwrite files, it should warn us

Well, supporting the pipeline when writing a cmdlet is easy: we simply use the begin {} process {} end {} syntax. If we do things in a clever way, such as doing one-off setup in begin {} rather than once per URL, we should be able to reduce the amount of work we are doing for a large list of downloads.
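As a rough sketch of that shape, assuming a hypothetical function name Get-WebFile and a single Url parameter (neither is the finished cmdlet), the skeleton might look like this. The download itself is left out; the point is only where each piece of work happens.

```powershell
function Get-WebFile {   # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        # ValueFromPipeline lets us pipe a list of URLs straight in.
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Url
    )
    begin {
        # Runs once: create a single WebClient for the whole pipeline.
        $client = New-Object System.Net.WebClient
    }
    process {
        # Runs once per piped URL. The actual download is omitted here;
        # the URL is simply passed through.
        Write-Verbose "Would download $Url"
        $Url
    }
    end {
        # Runs once after the last URL: release the client.
        $client.Dispose()
    }
}
```

With this in place, something like 'http://example.com/a.zip', 'http://example.com/b.zip' | Get-WebFile runs begin once, process twice and end once, so the WebClient is only created a single time.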

The second requirement, optionally specifying the filename to download the file to...that can be a little more difficult! The trick is to have two parameters, filename and directory. Depending on what is or isn’t specified, we can do 4 different things.

  1. If both are specified -> join the two together
  2. If no filename or destination directory is specified -> the destination is the current directory (converted from .) joined with the "leaf" part of the URL
  3. If no filename is specified, but a directory is -> the destination is the specified directory joined with the "leaf" part of the URL
  4. If a filename is specified but a directory is not -> the destination is the current directory (converted from .) joined with the specified filename
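The four cases collapse nicely if each missing value is filled in first and the two are joined at the end. Here is a minimal sketch, assuming a hypothetical helper called Resolve-DownloadDestination with parameters Url, FileName and Directory (names chosen for illustration), and using Split-Path -Leaf as one way to get the "leaf". Note that a URL with a query string would carry it into the leaf, which this sketch does not handle.

```powershell
function Resolve-DownloadDestination {   # hypothetical helper, for illustration only
    param(
        [Parameter(Mandatory = $true)][string]$Url,
        [string]$FileName,
        [string]$Directory
    )
    # No directory given: use the current directory, converting "." to a full path.
    if (-not $Directory) { $Directory = Convert-Path -Path '.' }

    # No filename given: use the last segment ("leaf") of the URL.
    if (-not $FileName) { $FileName = Split-Path -Path $Url -Leaf }

    # All four cases end the same way: directory joined with filename.
    Join-Path -Path $Directory -ChildPath $FileName
}
```

Called with only -Url, this returns the current directory joined with the leaf of the URL; passing -FileName, -Directory or both overrides the corresponding part.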

The final requirement, not overwriting files by default, is also pretty simple. We can test to see if a file exists; if it doesn't, then we are OK to download the file to the destination. Otherwise we need a switch parameter, something like -clobber, to tell the cmdlet whether it's OK to overwrite the file. If it’s not OK, we will throw an error; if it is, we will use Write-Warning to make sure we know that a file is being overwritten.
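That check can be sketched on its own, assuming a hypothetical helper called Test-DownloadTarget with a Destination parameter and a Clobber switch (the names and the message wording are illustrative, not the finished cmdlet):

```powershell
function Test-DownloadTarget {   # hypothetical helper, for illustration only
    param(
        [Parameter(Mandatory = $true)][string]$Destination,
        [switch]$Clobber
    )
    if (Test-Path -LiteralPath $Destination) {
        if (-not $Clobber) {
            # The file exists and we were not told to overwrite it: stop here.
            throw "File '$Destination' already exists. Use -Clobber to overwrite it."
        }
        # The file exists and overwriting is allowed: say so loudly.
        Write-Warning "Overwriting existing file '$Destination'."
    }
    # Reaching this point means it is OK to download to $Destination.
}
```

If the destination does not exist, the function returns quietly; if it exists, the caller either gets an error or, with -Clobber, a warning before the download goes ahead.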

We will look at the cmdlet next time.