Team Project Proposal for ASE -- Reverse IP and domain name lookup

-- By Bojie Li

How many domains match the “*windows*” pattern?

How many websites (domains) are hosted on my shared server?

How many domains are owned by a specific person or company? How many “*windows*” domains does Google own?

Get unprecedented details about a domain name or IP address in just one click.

NABC Analysis

Need

Website owners have all had a hard time finding the perfect domain name. We want not only to search existing domain names that match a pattern, but also to list the domains owned by a competitor. The DNS and Whois systems hold all the information about a domain, but given the distributed nature of the domain system and the enormous size of the domain registries, no one has been able to run arbitrary queries against the “domain database”.

Approach

USTC has signed contracts with Verisign, Inc. and several other top-level domain registry operators for access to their full databases, which cover all .COM, .NET, .ORG and .INFO domain names. However, to get the email address and registrant of a domain name, one has to query the decentralized Whois database, which is rate limited per IP address. Our research group has access to PlanetLab, a globally distributed cluster of ~560 nodes, which can be used to spread the queries over many IP addresses and so work around the per-IP rate limit.
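As a rough illustration of how the Whois workload could be divided among the nodes, here is a minimal Python sketch. It is not part of the proposal: the function names are hypothetical, and the per-node rate used in the usage note is a made-up placeholder, since the actual Whois rate limits are not stated here.

```python
import hashlib


def assign_node(domain: str, n_nodes: int) -> int:
    """Map a domain to a node index with a stable hash, so reruns agree."""
    digest = hashlib.sha1(domain.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_nodes


def estimated_hours(n_domains: int, n_nodes: int, per_node_per_hour: int) -> float:
    """Wall-clock hours if every node queries at its own allowed rate."""
    return n_domains / (n_nodes * per_node_per_hour)
```

For example, if each node were allowed 1000 queries per hour (an assumed figure), estimated_hours(100_000_000, 560, 1000) gives about 179 hours, roughly a week, for one full pass over the data.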

Eventually we get one big table whose columns are domain name, name server, IP address, registrar, registrant name, registrant email and contact info. The rows are the ~100 million domain names registered under these top-level domains.
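To make the table concrete, here is a minimal sketch of the schema and two reverse lookups. It uses Python's built-in sqlite3 module only as a stand-in so that the example runs anywhere. The table name, column names, index names and sample rows are all assumptions for illustration (the sample data uses reserved documentation domains and addresses), and a real schema would probably need to allow several name servers and IP addresses per domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE domains (
        domain_name      TEXT PRIMARY KEY,
        name_server      TEXT,
        ip_address       TEXT,
        registrar        TEXT,
        registrant_name  TEXT,
        registrant_email TEXT,
        contact_info     TEXT
    )
""")
# Indexes on the columns used by the reverse lookups.
conn.execute("CREATE INDEX idx_domains_ip ON domains (ip_address)")
conn.execute("CREATE INDEX idx_domains_email ON domains (registrant_email)")

# Made-up sample rows.
rows = [
    ("example.com", "ns1.example.net", "192.0.2.10", "Registrar A",
     "Alice", "alice@example.org", ""),
    ("example.net", "ns1.example.net", "192.0.2.10", "Registrar B",
     "Bob", "bob@example.org", ""),
    ("example.org", "ns2.example.net", "192.0.2.20", "Registrar A",
     "Alice", "alice@example.org", ""),
]
conn.executemany("INSERT INTO domains VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def domains_on_ip(conn, ip):
    """Reverse IP lookup: all domains hosted on one IP address."""
    cur = conn.execute(
        "SELECT domain_name FROM domains WHERE ip_address = ? "
        "ORDER BY domain_name", (ip,))
    return [r[0] for r in cur]


def domains_by_email(conn, email):
    """Reverse Whois lookup: all domains registered with one email."""
    cur = conn.execute(
        "SELECT domain_name FROM domains WHERE registrant_email = ? "
        "ORDER BY domain_name", (email,))
    return [r[0] for r in cur]
```

With the sample rows, domains_on_ip(conn, "192.0.2.10") returns the two domains sharing that address, and domains_by_email(conn, "alice@example.org") returns the two domains registered by Alice.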

For wildcard matching of domain names (e.g. “*windows*”), we can use a GiST index in PostgreSQL, which is fast enough to handle ~100 million rows. The other query types are straightforward in SQL.
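The proposal does not say which operator class the GiST index would use. One likely candidate is PostgreSQL's pg_trgm extension, whose trigram indexes can serve LIKE '%windows%' queries; that choice is an assumption here, not something stated above. The Python sketch below is not PostgreSQL code. It only illustrates the idea behind a trigram index with a simplified in-memory version: use the trigrams of the search string to narrow down candidates, then verify each candidate with a real substring test. The class and function names are hypothetical.

```python
from collections import defaultdict


def trigrams(s):
    """Set of all 3-character substrings of s (empty if s is shorter)."""
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}


class TrigramIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # trigram -> domains containing it
        self._domains = set()

    def add(self, domain):
        domain = domain.lower()
        self._domains.add(domain)
        for t in trigrams(domain):
            self._postings[t].add(domain)

    def search(self, substring):
        """Return sorted domains containing substring (pattern *substring*)."""
        substring = substring.lower()
        grams = trigrams(substring)
        if not grams:
            # Fewer than 3 characters: nothing to filter on, scan everything.
            candidates = self._domains
        else:
            # A match must contain every trigram of the search string.
            candidates = set.intersection(
                *(self._postings.get(t, set()) for t in grams))
        # Trigram overlap is necessary but not sufficient, so recheck.
        return sorted(d for d in candidates if substring in d)
```

The final recheck matters because a domain can contain all the trigrams of the search string without containing the string itself.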

This project will not require too much coding. The major tasks are:

  1. Gather the data and preprocess it to remove noise.
  2. Design a Web UI that provides a safe and comfortable query interface.
  3. Monitor system load and do performance tuning if needed.

Benefit

  • Know detailed contact info about a domain name
  • Query existing domain names matching a pattern
  • Know geographical location and AS number of an IP address
  • Find websites hosted on a specific IP address
  • Find neighboring websites in terms of IP subnet
  • Find domains owned by a specific email address or company
  • Get the best available information about an IP address or domain name on one page
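The "neighboring websites" lookup can be sketched with Python's standard ipaddress module. This is only an illustration under assumptions: the function name, the /24 default prefix length and the in-memory dictionary are hypothetical, and a real service would run the equivalent query against the database.

```python
import ipaddress


def subnet_neighbors(ip_to_domains, ip, prefix_len=24):
    """Domains hosted on other IPs in the same subnet as ip.

    ip_to_domains maps an IP address string to a list of domain names.
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    result = []
    for other_ip, domains in ip_to_domains.items():
        if other_ip != ip and ipaddress.ip_address(other_ip) in network:
            result.extend(domains)
    return sorted(result)
```

Given a mapping in which 192.0.2.10 and 192.0.2.77 each host one site, calling subnet_neighbors(mapping, "192.0.2.10") returns the site on 192.0.2.77, because both addresses fall in 192.0.2.0/24.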

Competitors

Our best-known competitor would be DomainTools.com, Alexa global rank 263 (9/26/2013). To my knowledge, it is the only web service that provides reverse IP and domain name lookups. But DomainTools is not completely free: query frequency is strictly limited, and some results are hidden or obscured for free accounts. For example, it shows at most 3 results for a reverse IP lookup, and most letters in the domain names are obscured in a reverse Whois lookup.

The high traffic of DomainTools implies a large demand for IP and domain name analysis. So I personally regard DomainTools not as a competitor but as a proof of concept, which strengthens the belief that our service will attract a substantial number of users.

posted @ 2013-09-26 20:31  CodeBreaking