PowerShell Web Enumeration – Get-WebsiteInfo

I was tasked with a penetration test of a /16 network and told to focus on the web applications. After a discovery scan returned ~15,000 hosts, I wanted to narrow in on systems that were potentially running some sort of web service. A neat little trick in nmap is to scan by service name, so I ran nmap -p http* -iL discover_hosts.txt, which returned ~4,000 machines running at least one service on a web port. Nearly 3,000 of the hosts were listening on either port 80 or 443. I randomly browsed to a few of these sites and found a lot of printers and forgotten Apache test pages, but I needed something to help organize and prioritize these web targets, so I coded a tool to make that easier.

The cmdlet I wrote, Get-WebsiteInfo, is a multi-threaded web scanner that takes a URL or list of URLs and sends a web request to each one. If the target responds, Get-WebsiteInfo records the redirect (if redirected), the Server header (if present in the response), and the title of the website (if it can be parsed). You can then pipe the output to Export-Csv for a nice, organized report. The code is located on my GitHub (if this link doesn’t work I probably just moved it…keep looking), and the remainder of this post will discuss portions of the code.
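As a quick illustration, a typical run might look like the following. Note that the parameter names here (-UrlFile, -Threads) are my approximation for this post; check the repo for the exact syntax.

```powershell
# Hypothetical invocation -- parameter names may differ from the repo.
# Scans the listed hosts with 10 threads and writes the results to a CSV report.
Get-WebsiteInfo -UrlFile .\web_hosts.txt -Threads 10 |
    Export-Csv -Path .\webinfo_report.csv -NoTypeInformation
```

The -NoTypeInformation switch just keeps the PSObject type header out of the CSV so the report opens cleanly in a spreadsheet.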

The first thing the cmdlet does is some basic URL processing. It expects URLs to be formatted as http(s)://address:port, such as https://laconicwolf.com:443; if they do not appear that way, the Process-Urls function will attempt to transform them into a format the script can use. The function checks whether the string starts with http (which also covers https), and if it does, leaves it alone. If it doesn’t start with http, the URL goes through more checks, including a check for a port indicator (:). The function extracts the number to the right of the ':' and compares it to lists of common HTTP and HTTPS ports. If the port number isn’t on either list, it tries both http and https.


Function Process-Urls {
        <# .DESCRIPTION Checks an array of URLs and transforms them into the following format: http(s)://addr:port #>
        Param(
            [Parameter(Mandatory = $True)]
            [array]$URLs
        )

        $HttpPortList = @('80', '280', '81', '591', '593', '2080', '2480', '3080', 
                  '4080', '4567', '5080', '5104', '5800', '6080',
                  '7001', '7080', '7777', '8000', '8008', '8042', '8080',
                  '8081', '8082', '8088', '8180', '8222', '8280', '8281',
                  '8530', '8887', '9000', '9080', '9090', '16080')                    
        $HttpsPortList = @('832', '981', '1311', '7002', '7021', '7023', '7025',
                   '7777', '8333', '8531', '8888')

        $ProcessedUrls = @()
        
        foreach ($Url in $URLs) {
            # checks for http(s) and * in domain name
            if ($Url.startswith('http')) {
                if ($Url -match '\*') {
                    $Url = $Url -replace '[*].',''
                }
                $ProcessedUrls += $Url
                continue
            }
            # checks for a port number and compares to port lists
            if ($Url -match ':') {
                $Port = $Url.split(':')[-1]
                if ($Port -in $HttpPortList) {
                    $ProcessedUrls += "http://$Url"
                    continue
                }
                elseif ($Port -in $HttpsPortList -or $Port.endswith('43')) {

                    $ProcessedUrls += "https://$Url"
                    continue
                }
                else {
                    $ProcessedUrls += "http://$Url"
                    $ProcessedUrls += "https://$Url"
                    continue
                }
            }
            # check for * in domain name. Example - *.laconicwolf.com
            if ($Url -match '\*') {
                $Url = $Url -replace '[*].',''
            }
            # no scheme or port indicator; try both http and https
            $ProcessedUrls += "http://$Url"
            $ProcessedUrls += "https://$Url"
        }
        return $ProcessedUrls
    }
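To make the behavior concrete, here is roughly what the function does with a few representative inputs; the expected transformations follow from the port lists and checks above.

```powershell
# Rough illustration of Process-Urls behavior (assumes the function above is loaded):
# '10.0.0.5:8080'               -> 'http://10.0.0.5:8080'   (8080 is on the HTTP list)
# '10.0.0.6:8443'               -> 'https://10.0.0.6:8443'  (port ends with '43')
# '10.0.0.7:5000'               -> both 'http://10.0.0.7:5000' and 'https://10.0.0.7:5000'
# 'https://laconicwolf.com:443' -> left as-is               (already starts with http)
Process-Urls -URLs @('10.0.0.5:8080', '10.0.0.6:8443', '10.0.0.7:5000', 'https://laconicwolf.com:443')
```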

 

After the processing, a few things are required to prevent errors and make everything work. Many of the web services I was scanning used self-signed certificates for HTTPS, or required a particular minimum (or maximum) TLS/SSL version, so I needed the script to ignore certificate errors. Unfortunately, the cmdlet I used for web requests, Invoke-WebRequest (IWR), doesn’t include an 'IgnoreCertificateErrors' option. Fortunately, Stack Overflow had some code snippets that worked.


    # ignore HTTPS certificate warnings
    # https://stackoverflow.com/questions/11696944/powershell-v3-invoke-webrequest-https-error
Add-Type @"
    using System.Net;
    using System.Security.Cryptography.X509Certificates;
    public class TrustAllCertsPolicy : ICertificatePolicy {
        public bool CheckValidationResult(
            ServicePoint srvPoint, X509Certificate certificate,
            WebRequest request, int certificateProblem) {
            return true;
        }
    }
"@
    [System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy

    # To prevent 'Could not create SSL/TLS secure channel.' errors
    # https://stackoverflow.com/questions/41618766/powershell-invoke-webrequest-fails-with-ssl-tls-secure-channel
    [Net.ServicePointManager]::SecurityProtocol = "Tls12, Tls11, Tls, Ssl3"

The Add-Type cmdlet lets you define a .NET class within the here-string (@" "@), which in this example is TrustAllCertsPolicy. The class is then instantiated and assigned as the certificate policy for the PowerShell session. The accepted protocol versions are manually set to allow everything from TLS 1.2 down to SSLv3.

I also felt the need to use a different User-Agent for each request, so I created a simple function to choose one of five user agents:


    Function Get-RandomAgent {

        # Get-Random's -Maximum is exclusive, so 6 is needed to allow 1-5
        $num = Get-Random -Minimum 1 -Maximum 6
        if($num -eq 1) {
            $ua = [Microsoft.PowerShell.Commands.PSUserAgent]::Chrome
        } 
        elseif($num -eq 2) {
            $ua = [Microsoft.PowerShell.Commands.PSUserAgent]::FireFox
        }
        elseif($num -eq 3) {
            $ua = [Microsoft.PowerShell.Commands.PSUserAgent]::InternetExplorer
        }
        elseif($num -eq 4) {
            $ua = [Microsoft.PowerShell.Commands.PSUserAgent]::Opera
        }
        else {
            $ua = [Microsoft.PowerShell.Commands.PSUserAgent]::Safari
        }
        return $ua
    }

This function uses the Get-Random cmdlet to pick a random number from 1 to 5 and returns a different User-Agent string depending on the number. (Watch out: Get-Random's -Maximum is exclusive, so it needs to be one higher than the largest value you want returned.)
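As an aside, the same effect can be had more compactly by letting Get-Random pick directly from an array. This is an equivalent sketch, not the code in the repo:

```powershell
Function Get-RandomAgent {
    # Get-Random returns a single random element when given an array via the pipeline
    $Agents = @(
        [Microsoft.PowerShell.Commands.PSUserAgent]::Chrome,
        [Microsoft.PowerShell.Commands.PSUserAgent]::FireFox,
        [Microsoft.PowerShell.Commands.PSUserAgent]::InternetExplorer,
        [Microsoft.PowerShell.Commands.PSUserAgent]::Opera,
        [Microsoft.PowerShell.Commands.PSUserAgent]::Safari
    )
    return ($Agents | Get-Random)
}
```

This version also sidesteps the off-by-one risk of the exclusive -Maximum entirely.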

IWR will throw an error if it receives a response other than a 2xx or 3xx, so a Try/Catch block must be used to capture the returned data in those cases.


    # send request to url
    if ($Proxy) {
        $Response = Try {
            Invoke-WebRequest -Uri $URL -UserAgent $UserAgent -Method Get -Proxy $Proxy -TimeoutSec 2 -UseBasicParsing
        }
        Catch {
            $_.Exception.Response
        }      
    }
    else {
        $Response = Try {
            Invoke-WebRequest -Uri $URL -UserAgent $UserAgent -Method Get -TimeoutSec 2 -UseBasicParsing
        }
        Catch {
            $_.Exception.Response
        }   
    }

The -UseBasicParsing switch must be used, or you risk IWR hanging indefinitely; certain characters in a web page will cause this when IWR attempts to parse the HTML. You lose the ParsedHtml property, but that’s better than having the script hang.

When running the script, I would occasionally get a popup asking for permission to store cookies on the computer. Sometimes this would make the script hang, so I again consulted Stack Overflow and added a registry value that prevents the popup by accepting all cookies. The value is deleted at the end of the script.


# accept all cookies to avoid popups
# https://stackoverflow.com/questions/31720519/windows-10-powershell-invoke-webrequest-windows-security-warning
$msg = reg add "HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\3" /t REG_DWORD /v 1A10 /f /d 0
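The cleanup at the end of the script is the matching reg delete against the same key and value (the exact line in the repo may differ slightly):

```powershell
# remove the cookie-acceptance value that was added above
$msg = reg delete "HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\3" /v 1A10 /f
```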

The $Response variable is then checked to see whether it was an exception or a 2xx/3xx response, and is parsed accordingly for the redirect URL, title, and Server header.


    # indicates a 2xx or 3xx response
    if ($Response.GetType().name -eq "BasicHtmlWebResponseObject") {

        # examine response to compare current url and requested url
        if ($Response.BaseResponse.ResponseUri.OriginalString.trim('/') -ne $URL.trim('/')) {
            $RedirectedUrl = $Response.BaseResponse.ResponseUri.OriginalString
        }
        else {
            $RedirectedUrl = ""
        }

        # finds title if available
        $Title = [regex]::match($Response.Content,'(?i)<title>(.*?)</title>').Groups[1].Value
        if (-not $Title) {
            $Title = ""
        }

        # examines response headers and extracts the server value if available
        if ($Response.BaseResponse.Server) {
            $Server = $Response.BaseResponse.Server
        }
        else {
            $Server = ""
        }
    }

    # indicates a 4xx or 5xx response
    elseif ($Response.GetType().name -eq "HttpWebResponse") {
        # examine response to compare current url and requested url
        if ($Response.ResponseUri.OriginalString.trim('/') -ne $URL.trim('/')) {
            $RedirectedUrl = $Response.ResponseUri.OriginalString
        }
        else {
            $RedirectedUrl = ""
        }

        # extracts the html 
        $Result = $Response.GetResponseStream()
        $Reader = New-Object System.IO.StreamReader($Result)
        $Reader.BaseStream.Position = 0
        $Reader.DiscardBufferedData()
        $ResponseBody = $Reader.ReadToEnd();

        # finds title if available
        $Title = [regex]::match($ResponseBody,'(?i)<title>(.*?)</title>').Groups[1].Value
        if (-not $Title) {
            $Title = ""
        }

        # examines response headers and extracts the server value if available
        if ($Response.Server) {
            $Server = $Response.Server
        }
        else {
            $Server = ""
        }
    }

All of this data is then made into an object and returned:


    # creates an object with properties from the html data
    $SiteData += New-Object -TypeName PSObject -Property @{
                                    "URL" = $URL
                                    "RedirectURL" = $RedirectedUrl
                                    "Title" = $Title
                                    "Server" = $Server
                                    }

    return $SiteData

This next part was challenging for me. I knew from the start that the script would need to be multithreaded, and this was my first experience with PowerShell Runspace Pools. The most helpful resource I read was this: https://www.codeproject.com/Tips/895840/Multi-Threaded-PowerShell-Cookbook. It included relevant examples and really helped me understand how to implement multithreading in PowerShell. Here is what it looks like in my code; essentially, imagine that all of the code I’ve mentioned so far lives inside a $ScriptBlock.

I initialize a pool where my threads will launch, and an empty array to store the thread data:


$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, $Threads)
$RunspacePool.Open()
$Jobs = @()

I then iterate through my list of URLs, starting a new job for each URL, with the maximum number of concurrent jobs capped by the $Threads variable:


    ForEach ($URL in $URLs) {

        # maps the command line options to the scriptblock
        if ($Proxy -and -not $Info) {
            $Job = [powershell]::Create().AddScript($ScriptBlock).AddParameter("Url", $URL).AddParameter("Proxy", $Proxy)
        }
        else {
            $Job = [powershell]::Create().AddScript($ScriptBlock).AddParameter("Url", $URL)
        }
        
        # starts a new job for each url
        $Job.RunspacePool = $RunspacePool
        $Jobs += New-Object PSObject -Property @{
            Url = $URL    # $_ isn't populated in a foreach statement, so record the URL itself
            Job = $Job
            Result = $Job.BeginInvoke()
        }
    }

All the data needs to be combined, so a $Data array is initialized and the result of each thread/job is added to it, then displayed to the user (right away if the -Info switch is used, otherwise at the end).


    # combine the return value of each individual job into the $Data variable
    $Data = @()
    ForEach ($Job in $Jobs) {
        $SiteData = $Job.Job.EndInvoke($Job.Result)
        $Data += $SiteData

        if ($Info) {
            if ($SiteData) {

                # transform the object data into a string without column headers
                $SiteDataString = $SiteData | ForEach-Object {
                     "[+] {0} {1} {2} {3}" -f $_.URL,$_.RedirectURL,$_.Title,$_.Server 
                     }
                Write-Host "$SiteDataString"
            }
            else {
                Write-Host "[-] Site did not respond"
            }
        }
    }
    
    # display the returned data
    $Data

I must admit it took me a while and a fair amount of reading to wrap my head around this, but I’m happy with the result.

As mentioned, the code is located on my GitHub. Feedback is welcome, as always.
