Unlocking the Power of Open Source Intelligence (OSINT): Dive into an Exciting Real-World Case Study!

This is part 2 of our series of articles on OSINT. Find all articles here.
OSINT is the practice of gathering information from publicly available sources to support intelligence needs. In the cybersecurity arena, OSINT is widely used to discover vulnerabilities in IT systems, an activity commonly called technical footprinting. Footprinting is the first task hackers, both black hat and white hat, conduct before attacking computer systems. Gathering technical information about the target computer network is also the first phase of any penetration testing methodology.
In this article, I will demonstrate how various OSINT techniques can be exploited to gain useful intelligence from public sources about target computerized systems.
Technical Investigation of the Target Website
By knowing the programming language, web frameworks, and content management system (CMS) used to create the target website, we can search for vulnerabilities that affect these components (especially zero-day vulnerabilities) and then work to exploit them as soon as they are discovered.
There are different online services for examining the technology used to build a website. To use such a service, all you need to do is supply the target domain name; the service returns a full list of the technical specifications, libraries, and programming languages used to build the website. These services also reveal the target website's hosting provider, the name under which its SSL certificate is registered, and the type of email system it uses. The following are some popular services to use:
In the following screen capture, I use the BuiltWith service to investigate the technical specifications of a target website. This reveals different technical information (see Figure 1) and opens the door to closer examination of each technology used to build the website. Next, I need to check the list of technical specifications for unpatched operating systems or an outdated content management system with known vulnerabilities that I can exploit to gain entry to the target system.
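To make the idea concrete, here is a minimal sketch of one signal that technology profilers are commonly understood to rely on: HTTP response headers. The function name, the header list, and the sample values are my own illustrative assumptions, not how any particular service works internally.

```python
# Hypothetical helper: the header list below is an illustrative assumption,
# not an exhaustive or authoritative set of fingerprinting signals.
FINGERPRINT_HEADERS = ("server", "x-powered-by", "x-aspnet-version", "x-generator")


def fingerprint_from_headers(headers):
    """Return the technology hints found in a mapping of HTTP response headers."""
    # Header names are case-insensitive, so normalize them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in FINGERPRINT_HEADERS if name in lowered}


# Made-up sample headers, used only for illustration.
sample_headers = {
    "Server": "Microsoft-IIS/8.5",
    "X-Powered-By": "ASP.NET",
    "Content-Type": "text/html",
}
print(fingerprint_from_headers(sample_headers))
```

In practice, the headers would come from a real HTTP response. Many sites strip or fake these values, so treat the result as a hint rather than proof.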

For example, a large number of ASP.NET websites use Telerik controls (https://www.telerik.com) to enrich their design. To find security vulnerabilities associated with Telerik controls, you can go to https://www.cvedetails.com and search for Telerik (see Figure 2).

There are many websites that list security vulnerabilities in operating systems, software, and web applications. The following are the most popular ones that we can use to search for common security vulnerabilities and exposures:
Analytics and Tracking
Most websites use Google services to analyze traffic and serve advertisements. We can use this to discover domain names linked to the target, because websites that share the same Google AdSense or Google Analytics account are likely operated by the same owner. Dnslytics (https://dnslytics.com/reverse-analytics) is a free online service that finds domains sharing the same Google Analytics ID (see Figure 3).
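Before you can run a reverse lookup, you need the target's tracking IDs. The following is a minimal sketch, assuming the page source embeds a classic Universal Analytics ID (UA-…) or an AdSense publisher ID (ca-pub-…). The function name, the digit ranges in the patterns, and the sample HTML are my own assumptions, and newer G-… measurement IDs are not covered.

```python
import re

# UA-<account>-<property> is the classic Universal Analytics ID format;
# ca-pub-<digits> is the AdSense publisher ID format.
# The digit ranges are deliberately loose assumptions.
ANALYTICS_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")
ADSENSE_ID = re.compile(r"\bca-pub-\d{10,20}\b")


def extract_tracking_ids(html):
    """Return the unique tracking IDs found in a page's HTML source."""
    return {
        "analytics": sorted(set(ANALYTICS_ID.findall(html))),
        "adsense": sorted(set(ADSENSE_ID.findall(html))),
    }


# Made-up page fragment, used only for illustration.
sample_html = """
<script>ga('create', 'UA-1234567-1', 'auto');</script>
<ins class="adsbygoogle" data-ad-client="ca-pub-1234567890123456"></ins>
"""
print(extract_tracking_ids(sample_html))
```

The IDs returned can then be pasted into a reverse analytics service such as the one mentioned above.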

Target Website History
In many instances, checking old versions of the target website can reveal important information. For example, an old version of a corporate website may reveal senior managers’ email addresses and phone numbers that were later removed from the current version. The Wayback Machine (https://archive.org/web) is a good place to start your search for old versions of websites (see Figure 4).
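The same lookup can be scripted. Below is a minimal sketch built around the Wayback Machine availability endpoint. The function names and the sample response are my own assumptions for illustration, and the code only builds the request URL and parses a response body; it does not send anything over the network.

```python
import json
from urllib.parse import urlencode

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(domain, timestamp=None):
    """Build the availability query URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": domain}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the URL of the closest archived snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Made-up response body in the shape the endpoint returns, for illustration only.
sample_response = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150101000000/http://example.com/",
            "timestamp": "20150101000000",
            "status": "200",
        }
    }
})
print(availability_url("example.com", "20150101"))
print(closest_snapshot(sample_response))
```

Requesting several timestamps across the years gives a quick view of how the site changed over time.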

Sub-domain Name Discovery
Finding the sub-domains of a target website is important because they can reveal sensitive information about the target, such as the address of its VPN portal, email system, or FTP server, where some files may have been left unprotected. To find all sub-domain names of a target that are indexed by Google, use the following Google search command (see Figure 5).
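As a rough sketch of this step, the code below builds one commonly used form of the query (an assumption on my part, not necessarily the exact command in the figure) and then reduces a list of result URLs to unique sub-domains. The function names and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def subdomain_query(domain):
    """Build a Google query that lists indexed hosts of a domain other than www."""
    # site: restricts results to the domain; -inurl:www filters out the main site.
    return f"site:{domain} -inurl:www"


def unique_subdomains(result_urls, domain):
    """Return the sorted, de-duplicated hostnames that belong to the target domain."""
    suffix = "." + domain.lower()
    found = set()
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(suffix):
            found.add(host)
    return sorted(found)


# Made-up search result URLs, used only for illustration.
results = [
    "https://vpn.example.com/login",
    "http://mail.example.com/",
    "https://vpn.example.com/help",
    "https://notexample.com/",
]
print(subdomain_query("example.com"))
print(unique_subdomains(results, "example.com"))
```

Matching on the leading dot keeps look-alike domains such as notexample.com out of the results.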


Type and versions of IT infrastructure of the target company
Job websites – and any job announcement posted on the target website – should be analyzed to discover the exact IT infrastructure used by the target organization. For example, I conducted a simple search on employee resumes on job websites and was able to capture important information about target organization security systems (e.g. Firewalls and Intrusion Detection Systems), server operating system type, email system, networking devices, types of backup systems and much more (see Figure 6).

Harvest digital files hosted on the target domain name
Advanced Google search techniques (also known as Google dorks) can reveal a great amount of information about the target organization’s IT systems, in addition to confidential files left on its public servers. There are thousands of Google dorks, and you can practice creating your own. A comprehensive list of Google dorks can be found in the Google Hacking Database (https://www.exploit-db.com/google-hacking-database).
I will use a Google dork to locate all PDF files posted on the target website (see Figure 7):

In the above example, I searched for PDF files. However, you can change the file type to any other type you want (doc, docx, xls, txt).
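To repeat this search across several file types, the dorks can be generated in a loop. This is a minimal sketch assuming the usual site: and filetype: operators; the function names and the example domain are my own, and the generated links are meant to be opened manually in a browser.

```python
from urllib.parse import quote_plus


def filetype_dorks(domain, extensions=("pdf", "doc", "docx", "xls", "txt")):
    """Build one 'site:<domain> filetype:<ext>' query per file extension."""
    return [f"site:{domain} filetype:{ext}" for ext in extensions]


def google_search_url(query):
    """Turn a query string into a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# example.com is a placeholder for the target domain.
for dork in filetype_dorks("example.com"):
    print(dork, "->", google_search_url(dork))
```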
Information Contained Within File Metadata
For each file found on the target website, we should investigate its metadata. Metadata is data about data. In technical terms, it is hidden descriptive information about the file it belongs to. For example, the metadata in an MS Office document might include the author’s name, the creation date/time, comments, and the software used to create the file, in addition to the OS of the device used to create it (see Figure 8).

From Figure 8, I found the following facts in the subject PDF file’s metadata:
- PDF format version of the file: 1.5
- Application used to create the report: MS PowerPoint 2010 (using the “Save As” function)
- Type of OS used on the target device: Windows
- File creation date/time: July 2017
- Author name (the person who created the file)
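Fields like these can also be pulled out with a few lines of code. The sketch below reads the PDF header and looks for simple literal strings in the document information dictionary. It is a deliberately naive illustration under my own assumptions: it does not handle compressed object streams, hex-encoded or escaped strings, XMP metadata, or encrypted files. The function names and the sample bytes are hypothetical.

```python
import re


def pdf_version(data):
    """Return the PDF format version from the file header, e.g. '1.5'."""
    match = re.match(rb"%PDF-(\d\.\d)", data)
    return match.group(1).decode("ascii") if match else None


def info_fields(data, keys=("Author", "Creator", "Producer", "CreationDate")):
    """Return the info dictionary entries that are stored as plain literal strings."""
    fields = {}
    for key in keys:
        # Matches simple entries such as: /Author (Jane Doe)
        match = re.search(rb"/" + key.encode("ascii") + rb"\s*\(([^)]*)\)", data)
        if match:
            fields[key] = match.group(1).decode("latin-1")
    return fields


# Made-up, minimal PDF fragment, used only for illustration.
sample_pdf = (
    b"%PDF-1.5\n1 0 obj\n"
    b"<< /Author (Jane Doe) /Creator (Microsoft PowerPoint 2010) "
    b"/CreationDate (D:20170712093000) >>\nendobj\n"
)
print(pdf_version(sample_pdf))
print(info_fields(sample_pdf))
```

For real investigations, a dedicated metadata tool is a better choice; the point here is only to show where these values live inside the file.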
If the file contains an author name, an additional search can be conducted to look up more details about the file’s author using specialized people search websites. The following are some popular people search engines:
- Spokeo (https://www.spokeo.com) (see Figure 9)
- Truepeoplesearch (https://www.truepeoplesearch.com)
- Truthfinder (https://www.truthfinder.com)
- 411 (https://www.411.com)

Figure 9 – Using Spokeo to look up information about people you know
Email naming criteria
To predict the naming convention the target organization uses when creating new email accounts, we should examine the format of its existing email addresses. For example, many organizations use the following conventions:
- Most common pattern for new email addresses: {first}.{first three characters of last}@exampleWebsite.com
- Other conventions include: {first}@exampleWebsite.com
I usually use https://www.email-format.com to find the email address formats in use at thousands of companies.
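Once a pattern is known, candidate addresses for a named employee can be generated automatically. This is a minimal sketch of the two patterns listed above, under my reading that the first pattern means the first name, a dot, and the first three characters of the last name. The function name and the sample person are hypothetical.

```python
def candidate_emails(first, last, domain):
    """Generate candidate addresses for one person from the two patterns above."""
    first, last = first.lower(), last.lower()
    local_parts = [
        f"{first}.{last[:3]}",  # first name, dot, first three characters of last name
        first,                  # first name only
    ]
    return [f"{part}@{domain}" for part in local_parts]


# Made-up person, used only for illustration.
print(candidate_emails("Jane", "Doering", "exampleWebsite.com"))
```

The candidates are guesses that still need to be confirmed against addresses the organization has actually published.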
Leaked Credentials
Leaked account credentials are spread everywhere online, especially on the darknet. For example, pastebin websites (see Figure 10) contain a vast number of leaked credentials. Anonymous file sharing websites, such as https://anonfile.com (see Figure 11), also host large numbers of leaked credential files with billions of records.


Conclusion
In this article, I tried to give a brief overview of OSINT capabilities and how to use them to gather useful intelligence about different entities.
In today’s information age, OSINT skills are valuable to have. However, there are several prerequisites you should master to make your OSINT searches rich and effective. For instance, before you begin an OSINT search, you should learn how to conceal your digital identity and become anonymous online. This is essential to prevent threat actors from discovering your search activities. OSINT is also closely related to digital forensics, and a basic knowledge of digital forensics operations will prove useful when conducting OSINT gathering activities.
In the next article, I will cover how to protect your online privacy. I will discuss the techniques currently used to track and profile Internet users and how to avoid them. I will also explore the layers of the web and show you how to access the darknet, and how to use anonymity networks such as Tor to surf the ordinary web anonymously.