Nmap Scan to CSV

I wrote a Python 3 script that parses an Nmap XML file into CSV output and performs a few other useful functions. The code is on my GitHub, along with usage instructions. This post provides some background and basic usage, as well as a short explanation of how some of the code is implemented.

Background

Nmap is probably the most well-known network scanner, but surprisingly few options exist to convert the scan output to a CSV file. When dealing with a large number of hosts, my preference is to analyze the data in a spreadsheet, where I can sort and filter the data. While the main purpose of the script is to convert the scan.xml file to a scan.csv file, I also included a few functions that I find useful.

Basic Usage:

Converting nmap_scan.xml to nmap_scan.csv is simple:

python3 nmap_xml_parser.py -f nmap_scan.xml -csv nmap_scan.csv

The parser ignores hosts that are down and ports that are not open, as well as several other data elements in the XML file that I didn’t care to include. Here is what is captured in the CSV file:

Useful Functions:

Display all IP addresses that are Nmap identified as up:

python3 nmap_xml_parser.py -f nmap_scan.xml -ip

The -ip switch just prints the IPs. Redirect the output to a file, and you have a nice list of IP addresses that you can feed into any number of other tools.

Display the least common (or most common) open ports:

python3 nmap_xml_parser.py -f nmap_scan.xml -lc <number>

I built these functions to see whether looking at the least frequently occurring ports would turn up anything interesting. The -lc switch examines all of the open ports and their number of occurrences, and returns the n least common open ports (use -mc for the most common).
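As a rough illustration of the idea (not the script's actual code), the frequency count can be sketched with collections.Counter. The row layout [ip, proto, port] and the function names below are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical rows in the shape [ip, proto, port]; the real script's rows
# carry more columns.
rows = [
    ["192.168.1.1", "tcp", 80],
    ["192.168.1.1", "tcp", 443],
    ["192.168.1.2", "tcp", 80],
    ["192.168.1.2", "tcp", 443],
    ["192.168.1.3", "tcp", 80],
    ["192.168.1.3", "tcp", 22],
]

def port_counts(rows):
    """Count how many times each open port appears across all rows."""
    return Counter(row[2] for row in rows)

def most_common_ports(rows, n):
    """Return the n most frequent (port, count) pairs, most frequent first."""
    return port_counts(rows).most_common(n)

def least_common_ports(rows, n):
    """Return the n least frequent (port, count) pairs, least frequent first."""
    # Sort ascending by count, breaking ties by port number for stable output.
    ordered = sorted(port_counts(rows).items(), key=lambda pc: (pc[1], pc[0]))
    return ordered[:n]

print(most_common_ports(rows, 1))   # [(80, 3)]
print(least_common_ports(rows, 2))  # [(22, 1), (443, 2)]
```

Ports that show up on only one or two hosts are often the interesting ones, which is why the least common view is sorted from rarest upward.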

Display IP addresses that have a specific port open:

python3 nmap_xml_parser.py -f nmap_scan.xml -fp <port number>

Another nice feature: this displays only the IP addresses that have a specific port open. It is a great way to get a list of machines that are running a specific service.
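The filtering step can be sketched as follows. This is only an approximation under assumed names: the [ip, proto, port] row layout and the ips_with_port() helper are hypothetical and not taken from the script.

```python
# Hypothetical rows in the shape [ip, proto, port].
rows = [
    ["192.168.1.1", "tcp", 80],
    ["192.168.1.1", "tcp", 443],
    ["192.168.1.2", "tcp", 80],
    ["192.168.1.3", "tcp", 22],
]

def ips_with_port(rows, port):
    """Return the unique IPs that have the given port open, in first-seen order."""
    matches = []
    for ip, _proto, row_port in rows:
        if row_port == port and ip not in matches:
            matches.append(ip)
    return matches

print(ips_with_port(rows, 80))   # ['192.168.1.1', '192.168.1.2']
print(ips_with_port(rows, 443))  # ['192.168.1.1']
```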

Code Implementation

Parsing the XML:

I won’t go into great detail on how to parse XML with Python, but here is a code snippet showing how to parse the IP address information from an Nmap scan.xml file:

>>> import xml.etree.ElementTree as etree   # import the xml module
>>> tree = etree.parse('scan.xml')    # read in the xml to a variable called tree
>>> root = tree.getroot()    # assign the root element to a variable called root
>>> hosts = root.findall('host')    # find all elements named 'host'
>>>
>>> # loop through the elements, finding the address element and printing its attribute
>>> for host in hosts:
...     ip_address = host.findall('address')[0].attrib['addr']
...     print(ip_address)
...
192.168.1.1
192.168.1.2
>>>
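The same approach extends to ports. The sketch below is not the script's code: the inline sample is a heavily trimmed stand-in for real Nmap XML output, and the open_ports() helper is a hypothetical name. It skips hosts that are down and ports that are not open, and builds one row per open port.

```python
import xml.etree.ElementTree as etree

# A trimmed, hand-written sample in the shape of Nmap XML output (assumption:
# real files carry many more elements and attributes).
SAMPLE_XML = """<nmaprun>
  <host>
    <status state="up"/>
    <address addr="192.168.1.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <state state="open"/>
        <service name="http"/>
      </port>
      <port protocol="tcp" portid="23">
        <state state="closed"/>
        <service name="telnet"/>
      </port>
    </ports>
  </host>
  <host>
    <status state="down"/>
    <address addr="192.168.1.2" addrtype="ipv4"/>
  </host>
</nmaprun>"""

def open_ports(root):
    """Return [ip, proto, port, service] rows for open ports on hosts that are up."""
    rows = []
    for host in root.findall('host'):
        status = host.find('status')
        if status is None or status.get('state') != 'up':
            continue  # ignore hosts that are down
        address = host.find('address')
        if address is None:
            continue
        ip = address.get('addr')
        for port in host.findall('ports/port'):
            state = port.find('state')
            if state is None or state.get('state') != 'open':
                continue  # ignore ports that are not open
            service = port.find('service')
            name = service.get('name', '') if service is not None else ''
            rows.append([ip, port.get('protocol'), port.get('portid'), name])
    return rows

root = etree.fromstring(SAMPLE_XML)
rows = open_ports(root)
print(rows)  # [['192.168.1.1', 'tcp', '80', 'http']]
```

Note that attribute values come back as strings, so the port number here is '80' rather than 80.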

The script basically does this for lots of elements (host_name, os_name, proto, port_id, service, etc.), and creates a data structure consisting of a list of lists (or array of arrays). For example, the data structure may look like this:

[
    [ip, hostname, os, proto, port, service, product, nse_output],
    [ip, hostname, os, proto, port, service, product, nse_output],
    [ip, hostname, os, proto, port, service, product, nse_output]
]

This data structure is then passed to other functions, depending on the arguments provided to the script.

This list-within-a-list data structure is very useful, and I use it in many of my scripts. The key to building it is the append and extend methods.

Using append and extend:

The append method adds a single item to the end of a list, while the extend method adds every item from an iterable to the list. Here is an example in the interpreter:

>>> host_data = [] # initialize an empty list for the host
>>> port_data = [] # initialize an empty list for port data
>>> proto = 'tcp'
>>> port = 80
>>> service = 'http'
>>> port_data.extend((proto, port, service)) # extend the existing port_data list
>>> port_data
['tcp', 80, 'http']
>>> host_data.append(port_data) # append the port_data list to the host_data list
>>> host_data
[['tcp', 80, 'http']]
>>> port_data = [] # reinitialize as an empty list to hold new port data. Usually done in a loop.
>>> proto = 'tcp'
>>> port = 443
>>> service = 'https'
>>> port_data.extend((proto, port, service)) # extend the port_data list, using the new port data
>>> host_data.append(port_data) # append the new port_data list to the host_data list
>>> host_data
[['tcp', 80, 'http'], ['tcp', 443, 'https']]

To view the implementation of how this works within the script, check out the get_host_data() function.
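The get_host_data() function is not reproduced here. As a rough sketch of the same append/extend pattern inside a loop, assume the scan results are already available as a dictionary; the scan variable and the build_rows() helper are hypothetical.

```python
# Hypothetical scan results keyed by IP; each value is a list of
# (proto, port, service) tuples.
scan = {
    "192.168.1.1": [("tcp", 80, "http"), ("tcp", 443, "https")],
    "192.168.1.2": [("tcp", 22, "ssh")],
}

def build_rows(scan):
    """Flatten the scan dictionary into a list of [ip, proto, port, service] rows."""
    rows = []
    for ip, ports in scan.items():
        for proto, port, service in ports:
            row = [ip]                          # start a fresh row for every port
            row.extend((proto, port, service))  # each field becomes its own column
            rows.append(row)                    # the whole row is added as one item
    return rows

rows = build_rows(scan)
print(rows)
```

Using append where extend is intended would nest the tuple inside the row, giving ['192.168.1.1', ('tcp', 80, 'http')] instead of four separate columns.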

This list-within-a-list data structure also makes it extremely easy to write to a CSV file.

Writing to a CSV file:

Using the example host_data from the above snippet, here is how to write that data to a CSV file:

>>> import csv
>>> csv_file = open('scan.csv', 'w', newline='') # open a new file. Set newline to ''
>>> csv_writer = csv.writer(csv_file) # initialize the csv_writer
>>> csv_header = ['Protocol', 'Port', 'Service'] # define the headers
>>> csv_writer.writerow(csv_header) # write the first line if you want headers
23
>>> for item in host_data: # loop through the host_data variable and write to the file
...     csv_writer.writerow(item)
...
13
15
>>> csv_file.close()

And this is what you get in scan.csv:

Protocol,Port,Service
tcp,80,http
tcp,443,https

To see how I handle some common errors, check out the parse_to_csv() function.
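The parse_to_csv() function is not shown in this post. A minimal sketch of one way to combine the CSV writing with basic error handling might look like the following; the write_csv() helper, its return values, and the temporary output path are assumptions for illustration, not the script's actual behavior.

```python
import csv
import os
import tempfile

def write_csv(path, header, rows):
    """Write header and rows to path. Return True on success, False on an OS error."""
    try:
        # The with block closes the file even if a write fails.
        # newline='' lets the csv module control line endings itself.
        with open(path, 'w', newline='') as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow(header)
            writer.writerows(rows)  # write every row of the list of lists
    except OSError as err:  # covers PermissionError, missing directories, etc.
        print(f"Could not write {path}: {err}")
        return False
    return True

host_data = [['tcp', 80, 'http'], ['tcp', 443, 'https']]
out_path = os.path.join(tempfile.gettempdir(), 'scan_example.csv')
ok = write_csv(out_path, ['Protocol', 'Port', 'Service'], host_data)
print(ok)
```

Returning a boolean keeps the sketch short; a command-line tool might instead print the error and exit with a non-zero status.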
