Using WHOIS Domain Lookup Tools To Identify Malicious Domains And Prove Misuse
A domain is a good starting point for a cyber investigation. However, investigators must gather additional proof that it is indeed being abused in attacks. Apart from scrutinizing the digital footprints that persons of interest leave behind, IT professionals and law enforcement agents need to look at other online platforms where those persons may be active. WHOIS domain lookup and other cybersecurity research tools can help them stay hot on the heels of cybercriminals.
In the world of cybercrime, attackers often run large-scale campaigns or services relying on many domain names to direct users to malicious services or content. Attackers often register these domains in bulk, and these bulk registrations can be identified to provide hints that a domain is likely to be used for malicious purposes. Such hints may be particularly helpful for distinguishing between types of domains using wildcard DNS. In past studies, researchers noted that domains known to abuse wildcard DNS records were often registered in bulk, and a relatively high percentage used the same IP addresses or authoritative name servers. However, there are some challenges to using bulk registration as a key differentiator between benign and malicious use of wildcard DNS records. First, whois records, which provide registration information, often obscure the registrant for privacy reasons, making it difficult to identify bulk registrations. Second, high levels of concentration among domains using wildcard records do not always provide a strong indication of abuse. Some hosting or DNS management providers may provide wildcard records by default or encourage their users to configure their domains with wildcard records. The same providers may also provide infrastructure or authoritative DNS name servers to their clients. These scenarios could easily result in many benign domains with wildcards using the same name servers and IP addresses.
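The idea of spotting concentration on shared infrastructure can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' method: the `(domain, name_server, ip)` record layout, the function name, and the `min_cluster_size` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_shared_infrastructure(records, min_cluster_size=3):
    """Group domains that share the same name server and IP address.

    `records` is an iterable of (domain, name_server, ip) tuples. This layout
    is a hypothetical input format chosen for the sketch.
    """
    clusters = defaultdict(set)
    for domain, name_server, ip in records:
        clusters[(name_server.lower(), ip)].add(domain.lower())
    # Keep only clusters large enough to hint at bulk registration.
    return {
        key: sorted(domains)
        for key, domains in clusters.items()
        if len(domains) >= min_cluster_size
    }


if __name__ == "__main__":
    sample = [
        ("a1.example", "ns1.bulk.example", "203.0.113.10"),
        ("a2.example", "ns1.bulk.example", "203.0.113.10"),
        ("a3.example", "ns1.bulk.example", "203.0.113.10"),
        ("shop.example", "ns.other.example", "198.51.100.7"),
    ]
    for key, domains in group_by_shared_infrastructure(sample).items():
        print(key, domains)
```

As the paragraph above cautions, a large cluster is only a hint. Hosting providers that enable wildcard records by default will produce the same pattern for benign domains.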
For our detection, we leverage a large passive DNS (pDNS) data set to identify domains using wildcard DNS records and filter them based on their key characteristics. Note from the example shown in Figure 2 that the response for doesnotexist[.]example[.]com, which is generated from the wildcard, does not reveal that a wildcard record exists. To find out, the user would have to ask the server directly for the IP address of *.example[.]com. Checking all domains for wildcard records is impractical, however. To efficiently search for malicious or suspicious domains, we use passively collected DNS data and hints from previously detected domains to regularly build lists of new domains to be checked.
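For a single domain, one common heuristic is to resolve a few random labels that almost certainly do not exist and see whether they all get an answer. This is a different technique from asking the server for `*.example.com` directly, and it is not the detector described above. The function names and the injectable `resolver` parameter are assumptions of this sketch.

```python
import secrets
import socket


def random_label(length=16):
    """Return a random hex label that is very unlikely to exist as a real host."""
    return secrets.token_hex(length // 2)


def default_resolver(hostname):
    """Resolve a hostname to an IPv4 address, or return None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def looks_like_wildcard(domain, resolver=default_resolver, probes=3):
    """Heuristic: True if every random subdomain probe under `domain` resolves."""
    for _ in range(probes):
        if resolver(f"{random_label()}.{domain}") is None:
            return False
    return True
```

Passing a custom `resolver` keeps the check testable without network access. Some recursive resolvers rewrite NXDOMAIN responses, which produces false positives, so the result should be treated as a hint only.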
Using information from whois records allows us to filter out many domains quickly. For the rest, we perform several checks that evaluate the characteristics of these domains. The system builds its knowledge base as it runs, iteratively checking domains and identifying related domains that also use wildcard records, which allows us to track entire campaigns using wildcard DNS records for less-than-honest purposes. In the weeks we have been running this detector, we have identified over 4,000 domains abusing wildcard DNS for questionable SEO campaigns or to promote sites related to gambling, adult content, or questionable video streaming. The next section explores a few of the cases we identified.
As the number of registrars of domains suspected to be malicious was expected to be very large, a way to prioritize conversations with registrars was needed. Our GCA team sought to identify the top registrars of suspect domains using the information available in the Domain Trust platform.
We define a temporal variation pattern (TVP) as the time series behavior of each domain name in various types of domain name lists. Specifically, we identify how and when a domain name has been listed in legitimate/popular and/or malicious domain name lists. Our motivation for considering TVPs is based on the observation that both legitimate and malicious domain names vary dramatically in domain name lists over time. There are three reasons for using different and multiple domain name lists. One is that the data are realistically observable; that is, we can easily access the data from domain name list maintainers. The second is that domain name lists are created on the basis of objective facts confirmed by the maintainer of those lists. The third is that multiple domain name lists and the time series changes in those lists can boost the reliability of listed domain names.
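To make the definition concrete, the sketch below labels a domain's listing history in one domain name list. The input format (one boolean per snapshot, oldest first) and the four labels are simplifying assumptions for illustration. Only the "Stable" label appears elsewhere in this text, and the actual TVP definitions may differ.

```python
def classify_tvp(presence):
    """Label a listing history with an illustrative pattern name.

    `presence` holds one boolean per snapshot, oldest first: True if the
    domain was on the list at that time. The labels are assumptions.
    """
    if not presence:
        raise ValueError("at least one snapshot is required")
    if not any(presence):
        return "null"      # never listed
    if all(presence):
        return "stable"    # listed in every snapshot
    # Mixed history: decide by whether the domain is listed at the end.
    return "rise" if presence[-1] else "fall"


if __name__ == "__main__":
    print(classify_tvp([False, False, True]))
    print(classify_tvp([True, True, False]))
```

Running the same function over a popular-sites list and a blacklist would give each domain one label per list, which could then be used as a feature.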
In terms of early detection of future malicious domain names, we investigated when the system can detect such domain names. Specifically, we analyzed the number of days that elapsed from February 28, 2015, when the learning model was created, for malicious domain names to be detected by the system. For example, if the system correctly detected and identified a new malicious domain name on March 7, 2015, the elapsed number of days for the domain name is seven. Tables 7 and 8 show the descriptive statistics of the elapsed days for malicious domain names for each feature set. Note that we only count domain names in the TP of each dataset shown in Tables 5 and 6. The descriptive statistics include the minimum (days_Min); the first quartile (days_1stQu), which is the value cutoff at the first 25% of the data; the second quartile, which is also called the median and is the value cutoff at 50% of the data; the mean (days_Mean); the third quartile (days_3rdQu), which is the value cutoff at 75% of the data; and the maximum (days_Max). Table 7 shows that, in the best case, our proposed system (TVP+rIP+rDomain) can precisely predict future malicious domain names 220 days before the ground truth, such as honeyclients and sandbox systems, identifies them as malicious. Comparing the above results with Table 8 reveals that the conventional DNS-based feature set (rIP+rDomain) also detects malicious domain names early; however, the number of detected domain names (TP) is quite small, as shown in Table 6. We conclude that our proposed system using TVPs outperforms the system using only the conventional DNS-based feature set from the perspectives of both accuracy and earliness.
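The elapsed-day calculation and the descriptive statistics named above can be reproduced with the Python standard library. This is a sketch: the sample values are invented for illustration and are not taken from Tables 7 or 8, the `days_Median` key is a name chosen here because the text gives no label for the median, and the inclusive quantile method is an assumption about how the quartiles were computed.

```python
from datetime import date
import statistics

MODEL_DATE = date(2015, 2, 28)  # date the learning model was created


def elapsed_days(detected_on, model_date=MODEL_DATE):
    """Days between model creation and the detection date."""
    return (detected_on - model_date).days


def describe(days):
    """Return the descriptive statistics listed in the text (needs >= 2 values)."""
    q1, median, q3 = statistics.quantiles(days, n=4, method="inclusive")
    return {
        "days_Min": min(days),
        "days_1stQu": q1,
        "days_Median": median,  # key name is an assumption of this sketch
        "days_Mean": statistics.mean(days),
        "days_3rdQu": q3,
        "days_Max": max(days),
    }


if __name__ == "__main__":
    print(elapsed_days(date(2015, 3, 7)))  # 7, matching the example in the text
    print(describe([7, 14, 21, 28, 35]))   # made-up sample values
```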
hpHosts-Stable: This TVP is designed to determine the characteristics of malicious domain names abusing easy-to-use services, such as bullet-proof hosting, to improve the TPR. For example, we observed many subdomains using a domain generation algorithm (DGA) under the same 2LD part such as 84c7zq.example.com.
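Counting distinct subdomains under the same 2LD part is one simple way to surface the pattern in the example above. The sketch below is an illustration only: it naively treats the last two labels as the 2LD part, which is wrong for suffixes such as co.uk (a public suffix list would be needed there), and the function names are hypothetical.

```python
from collections import defaultdict


def registered_domain(fqdn):
    """Naive 2LD extraction: keep the last two labels (an assumption)."""
    labels = fqdn.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def count_subdomains(fqdns):
    """Count distinct subdomains observed under each 2LD part."""
    seen = defaultdict(set)
    for name in fqdns:
        normalized = name.lower().rstrip(".")
        parent = registered_domain(normalized)
        if normalized != parent:
            seen[parent].add(normalized)
    return {parent: len(subs) for parent, subs in seen.items()}


if __name__ == "__main__":
    names = ["84c7zq.example.com", "x9k2pd.example.com", "example.com"]
    print(count_subdomains(names))
```

A 2LD with an unusually high count of random-looking subdomains would be a candidate for closer inspection, not a verdict.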
Manadhata et al. proposed a method for detecting malicious domain names from event logs in an enterprise network by using graph-based analysis. Boukhtouta et al. proposed an analysis method for creating graphs from sandbox results to understand the relationships among domain names, IP addresses, and malware family names. Kührer et al. proposed a method for identifying parked and sinkhole domain names from Web sites and blacklist content information by using graph analysis. DomainProfiler strongly relies on the TVP or time series information, which these studies did not use, to precisely predict future malicious domain names. Chiba et al. used the characteristics of past malicious IP addresses to detect malicious Web sites. Our system uses not only IP address features (rIPs) but also TVPs to precisely detect malicious domain names.
Use Cases for a Whois Lookup
Incident Response and Threat Intelligence
The most obvious benefit of a whois lookup for those responding to a security incident is identifying the netblock and the ISP that owns a particular IP address. With this information, the incident responder can contact the owner of the netblock to alert the provider to the presence of malicious traffic.
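A whois lookup of this kind can be scripted directly against the WHOIS protocol, which is a plain TCP exchange on port 43: send the query followed by CRLF and read until the server closes the connection. The sketch below assumes `whois.iana.org` as a starting server. The regional registry that actually holds the netblock record, and the field names it returns, vary, so the parser is a generic key/value splitter and not a definitive format.

```python
import socket


def whois_query(query, server="whois.iana.org", port=43, timeout=10.0):
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(response):
    """Split 'Key: value' lines into a dict of lists, skipping comments."""
    fields = {}
    for line in response.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip().lower(), []).append(value.strip())
    return fields


if __name__ == "__main__":
    raw = whois_query("192.0.2.1")  # documentation address, for illustration
    print(parse_fields(raw))
```

The IANA response typically names the registry responsible for an address range, and the same `whois_query` call can then be repeated against that registry's server to reach the netblock owner's contact details.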
Several organizations offer free online tools for looking up a potentially malicious website. Some of these tools provide historical information; others examine the URL in real time to identify threats.
Parked domains can also be used in phishing attacks. Apart from pointing visitors to malicious sites, such domains become tools for spoofing and elaborate business email compromise (BEC) scams. Consequently, organizations whose domains are abused may suffer brand damage, lose customers, and possibly even be sued by partners that also took a hit from the incident.