Shodan: Searching the Internet of Things

Shodan took the Internet by storm in 2009, when creator John Matherly launched a search engine that allowed users to find specific types of computers connected to the Internet.  It gave real insight into the number of devices that are constantly being plugged directly into the Internet, for everyone to view, without proper security considerations.  The project was mostly intended to shed light on the number of unsecured Internet of Things (IoT) devices, but the site also touts: "Use Shodan to discover which of your devices are connected to the Internet, where they are located and who is using them."

Shodan works by scanning random IP addresses and random ports on the Internet and performing what is known as banner grabbing.  Banner grabbing is a technique used to gain information about a computer system on a network and the services running on its open ports.  An open port is a network port that is reachable through any firewall in the path and has a service actively listening for connections.  This is the opposite of a closed port, where traffic may get through the firewall but no data comes back, either by design or because no network service is listening.  Banner grabbing can be performed manually or in an automated fashion, as Shodan does.  The manual way to perform a banner grab is with a network utility like telnet or Netcat.  For example, a banner grab against this web server might return the following data:

$ nc 35.237.35.24 80
GET / HTTP/1.1

HTTP/1.1 400 Bad Request
Server: nginx
Date: Mon, 30 Jul 2018 23:12:47 GMT
Content-Type: text/html
Content-Length: 166
Connection: close
X-Xss-Protection: 1; mode=block
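
The same manual banner grab can be sketched in Python with the standard socket module.  This is only a minimal illustration of the technique, not how Shodan itself is implemented; the function names grab_banner and parse_banner are my own, and the parser only handles a simple HTTP-style reply.

```python
import socket


def parse_banner(raw: bytes) -> dict:
    """Split a raw HTTP-style reply into its status line and headers."""
    lines = raw.decode("iso-8859-1").splitlines()
    status = lines[0] if lines else ""
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()
    return {"status": status, "headers": headers}


def grab_banner(host: str, port: int = 80, timeout: float = 5.0) -> dict:
    """Connect, send a bare request line, and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"GET / HTTP/1.1\r\n\r\n")
        chunks = []
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                break  # the server went quiet; keep what we have
            if not data:
                break  # the server closed the connection
            chunks.append(data)
    return parse_banner(b"".join(chunks))
```

Calling grab_banner("35.237.35.24", 80) would mirror the Netcat session above.  Only run it against hosts you are allowed to probe.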

Shodan works by performing the above banner grab over and over across the possible combinations of IP addresses and ports on the Internet.  The IPv4 address space alone has 4,294,967,296 possible addresses, before ever factoring in the 65,535 different ports that can be open on each address, so you can see how a full scan might take a long time.  This is why Shodan distributes the work to multiple nodes known as web crawlers.  A web crawler, sometimes known as a spider, is an Internet bot that systematically visits targets (here, different IP addresses) for the purpose of indexing.  These crawlers are distributed around the world, and the data is passed back to a centralized server.  Distributing the crawlers around the world also helps work around country-wide blocking that might otherwise affect the data gathering.
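
To put those numbers in perspective, here is a quick back-of-the-envelope calculation.  The probe rate passed to days_to_scan is purely an assumed figure for illustration; I do not know Shodan's real scan rate.

```python
IPV4_ADDRESSES = 2 ** 32     # 4,294,967,296 possible IPv4 addresses
PORTS_PER_ADDRESS = 65535    # the per-address port count used above
SECONDS_PER_DAY = 86400


def total_targets(addresses: int = IPV4_ADDRESSES,
                  ports: int = PORTS_PER_ADDRESS) -> int:
    """Number of (address, port) pairs a complete scan would cover."""
    return addresses * ports


def days_to_scan(probes_per_second: float) -> float:
    """Days needed to probe every pair once at a steady, assumed rate."""
    return total_targets() / probes_per_second / SECONDS_PER_DAY


print("{:,} address/port pairs".format(total_targets()))
print("{:,.0f} days at 1,000,000 probes per second".format(days_to_scan(1_000_000)))
```

Even at a hypothetical one million probes per second, a single machine would need just under nine years for one full pass, which is why spreading the work across many crawlers matters.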

To avoid causing outages or other unintended consequences, Shodan's basic algorithm picks a random IPv4 address and then a random port.  It then performs a banner grab and checks whether the device responds in a way that is known.  Then it starts over, picking a new random IPv4 address and a new random port.  This randomness means the scan does not walk incrementally through network ranges.
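
The loop described above can be sketched roughly as follows.  This is my own simplified reading of the algorithm, not Shodan's actual crawler code; random_target, crawl, and the probe callback are hypothetical names, and probe stands in for whatever banner grab you plug in (it should return None when the reply is not recognized).

```python
import ipaddress
import random


def random_target(rng: random.Random) -> tuple:
    """Pick a random IPv4 address and a random port."""
    ip = ipaddress.IPv4Address(rng.getrandbits(32))
    port = rng.randint(1, 65535)
    return str(ip), port


def crawl(probe, rounds: int, rng: random.Random = None) -> list:
    """Probe `rounds` random targets and keep the ones with a known reply."""
    rng = rng or random.Random()
    found = []
    for _ in range(rounds):
        ip, port = random_target(rng)   # a fresh random pick every round
        banner = probe(ip, port)        # None means "no recognized response"
        if banner is not None:
            found.append((ip, port, banner))
    return found
```

Because every round draws a new address and port, consecutive probes are scattered across the address space instead of marching through one network range.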

Even though the full collection of data acquired by Shodan is not available, what makes Shodan great is the ability to search the collection for specific IP ranges or specific network services.  If you would like more in-depth knowledge of Shodan, I highly recommend the book written by the creator himself, John Matherly, for sale on LeanPub.  It's well worth the $5.00 USD minimum required to purchase the book.  Here, I will describe some of the simpler searches and demonstrate how to integrate the Shodan API with Python scripting.

If your goal is only to find a certain bit of information, or just to demonstrate the power of Shodan, you can simply navigate to www.shodan.io using any reasonably recent web browser, and you will be greeted with the Shodan homepage.  In the top left-hand corner is a simple search box that can be used to search for common terms.

To demonstrate Shodan, we will begin with a simple search for VNC servers to see if Shodan can find any Internet-accessible VNC devices.  Depressingly enough, the results are staggering.

Now we can take note of several different portions of the webpage.  Alongside the search results, Shodan provides a result map, top services (ports), top organizations (ISPs), top operating systems, and top products (software names).

Finally, in the main part of the page we can see the search results themselves.  Each result is broken up into sections that include information such as: IP address, hostname, ISP, when Shodan found the item, origin country, and the banner itself.
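
The same pieces of information show up when results come back through the Shodan API as Python dictionaries.  The sketch below pulls them out of one match; the field names (ip_str, hostnames, isp, timestamp, location, data) are the ones I have seen in API results, while SAMPLE_MATCH is a made-up record for illustration and not real Shodan data.

```python
# Hypothetical record shaped like one entry of results['matches'].
SAMPLE_MATCH = {
    "ip_str": "203.0.113.10",              # reserved documentation address
    "hostnames": ["cam.example.com"],
    "isp": "Example ISP",
    "timestamp": "2018-07-30T23:12:47.000000",
    "location": {"country_name": "Example Country"},
    "data": "HTTP/1.1 200 OK\r\nServer: nginx\r\n\r\n",
}


def summarize_match(match: dict) -> dict:
    """Collect the fields shown for each result on the website."""
    location = match.get("location") or {}
    return {
        "ip": match.get("ip_str"),
        "hostnames": match.get("hostnames", []),
        "isp": match.get("isp"),
        "found": match.get("timestamp"),
        "country": location.get("country_name"),
        "banner": (match.get("data") or "").strip(),
    }


for field, value in summarize_match(SAMPLE_MATCH).items():
    print("{:>9}: {!r}".format(field, value))
```

Using .get() throughout keeps the helper from crashing on records where a field is missing.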

Additionally, Shodan lets us search in more detail using filters.  We can find information that relates to specific cities, countries, geographic locations, hostnames, operating systems, or even whether Shodan has a recorded screenshot.  This is where the true power of Shodan shines.  Using the has_screenshot filter we can look only for VNC servers that have recorded screenshots.  Unfortunately, Shodan will block us from making more specific searches without being logged in first.  This is for accountability, but also because Shodan has become a viable business and wants to monetize its service.

Using an already created account, I will perform the search webcam has_screenshot:true.  Take note that there is no space between has_screenshot: and true.  This is specific to Shodan's filter syntax.
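
If you build queries in a script, a small helper can take care of the name:value format for you.  This is just a convenience sketch of my own (build_query is not part of any Shodan library); it assumes that filter values containing spaces need to be wrapped in double quotes.

```python
def build_query(terms: str = "", **filters) -> str:
    """Join free-text terms and name:value filters into one query string."""
    parts = [terms] if terms else []
    for name, value in filters.items():
        if isinstance(value, bool):
            value = "true" if value else "false"   # Shodan expects lowercase
        value = str(value)
        if " " in value:
            value = '"{}"'.format(value)           # quote multi-word values
        parts.append("{}:{}".format(name, value))  # no space around the colon
    return " ".join(parts)


print(build_query("webcam", has_screenshot=True))
print(build_query("vnc", country="US", city="San Diego"))
```

The first call produces the exact search used above, webcam has_screenshot:true, and the resulting string can be passed to api.search() in the script later in this post.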

As you can see from the screenshot above, Shodan has found many different webcams with screenshots captured in the past.  The screenshot shows what appears to be a road in Malaysia, and the HTTP header information describes the device as an IP Webcam Server 0.4.  Webcams are only the tip of the iceberg for Internet-connected devices with screenshots.  Some other popular searches are SCADA-connected devices, traffic lights, etc.  This shows how interesting and truly frightening the IoT landscape is.

As mentioned above, Shodan has different levels of accounts.  There are also premium accounts that allow you to export data for offline or independent research.  With a premium account you also get access to an API key that can be used to access Shodan from scripting languages such as Python.  To keep this blog post short I will not dive into the full setup of the Shodan API or the Slack API, but this is a script I created to send a random webcam screenshot to a Slack server I am a part of.  Feel free to take it and modify it to your needs or wants.

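#!/usr/bin/env python3
"""Post a link to a random Shodan host screenshot in a Slack channel."""

import math
import random

import shodan
from slacker import Slacker

SHODAN_API = "Shodan-API-Key-Here"
slack = Slacker('slack-API-Key-Here')

QUERY = 'has_screenshot:true'
RESULTS_PER_PAGE = 100  # the page math below assumes 100 results per page


def shodanMessage(api):
    # Count the matches first so a random page of results can be chosen.
    total = api.count(QUERY)['total']
    # randint() needs whole numbers, so round the page count up to an int.
    last_page = max(1, math.ceil(total / RESULTS_PER_PAGE))
    results = api.search(QUERY, page=random.randint(1, last_page))
    # random.choice() always picks a valid entry; randint(0, n) could return
    # n itself, which is one past the end of the list.
    shodan_result = random.choice(results['matches'])['ip_str']
    url_format = "https://www.shodan.io/host/{}/image".format(shodan_result)
    return url_format


def slackSend(url_format):
    # Post the screenshot link to the #random channel as "ShodanBot".
    slack.chat.post_message('#random', "Here is a Random host with a screenshot from SHODAN {}".format(url_format), username="ShodanBot")


if __name__ == "__main__":
    api = shodan.Shodan(SHODAN_API)
    slackSend(shodanMessage(api))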

Until next time!  Keep on Hacking!