IPv6: Helping You Make the Right Decisions
Version 6 of the Internet Protocol (IPv6) was first proposed as a draft in 1994 and published in 1995. The specification advanced to Draft Standard status in 1998. However, it took another 14 years, until World IPv6 Launch Day in 2012, for permanent production deployment of the technology to begin across the globe. Since that time, IPv6 adoption across the Internet has increased to 25%, with adoption in some countries of greater than 90%.
As the deployment of IPv6 has increased, the number of devices and people using IPv6 addresses has also grown. Companies that use IP address information to make decisions about user experience, legal compliance, or risk assessment should consider whether it’s time to incorporate IPv6 geolocation data into their toolset.
The IPv6 address space is huge, with about 340 undecillion (2^128) unique IP addresses. This is roughly 79 octillion (2^96, or about 7.9 x 10^28) times larger than the IPv4 address space, which itself has over 4 billion IP addresses. To put that in perspective, today a house with a cable modem usually gets a single IPv4 address from its ISP. For IPv6, however, the ISP will likely allocate that same home a /64 subnet, which is a block of over 18 million trillion IPv6 addresses. So, while it is possible for machines to gather information about every IPv4 address on a daily basis, it is not feasible to do this for the allocated IPv6 address space at this time. It is simply too large. This creates a set of decisioning data challenges for IPv6 geolocation that do not exist for IPv4.
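The arithmetic behind these figures can be checked with a short Python sketch. The address widths (32 and 128 bits) are fixed by the protocols; the probe rate of one million addresses per second is purely an assumed number chosen for illustration, not a measured figure.

```python
# Address-space sizes follow directly from the address widths.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS               # a little over 4 billion
ipv6_total = 2 ** IPV6_BITS               # about 340 undecillion
ratio = ipv6_total // ipv4_total          # 2 ** 96
slash64_size = 2 ** (IPV6_BITS - 64)      # addresses in a single /64 subnet

# Hypothetical probe rate, assumed only for this illustration.
PROBES_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600

years_per_slash64 = slash64_size / PROBES_PER_SECOND / SECONDS_PER_YEAR

print(f"IPv4 addresses:        {ipv4_total:,}")
print(f"IPv6 addresses:        {ipv6_total:.3e}")
print(f"IPv6 / IPv4 ratio:     {ratio:.3e}")
print(f"Addresses in one /64:  {slash64_size:,}")
print(f"Years to probe a /64:  {years_per_slash64:,.0f}")
```

Even at that assumed rate, sweeping a single home's /64 would take hundreds of thousands of years, which is why per-address scanning is not a realistic approach for IPv6.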
IP geolocation is inherently imprecise. There is no source of truth for where an IP address is physically located, and since IP addresses can be reassigned by the ISP, even a known location can go “out of date” and become incorrect. Because of this, a specific location (i.e., a latitude and longitude) provided by a geolocation database or web service should not be used to identify a particular address or household. There are also privacy concerns, especially in the European Union, that prevent geolocation services from publishing information that can be used to identify a specific user.
Neustar began collecting IPv6 address allocation information in 2015 in order to test our collection and synthesis algorithms in preparation for eventual publication to paying customers. Our sources include publicly available information, which in many cases gives the city and country of the ISP or corporation that was allocated a block of IP addresses. Fortunately, this high-level information does not change very frequently, so we are able to scan and ingest it quickly enough to capture all of the IPv6 metadata. However, although this information is useful, it often does not reflect where the addresses are actually in use. This is because a large ISP or corporation may sub-allocate addresses to its customers or users all over the globe.
To address this issue, Neustar relies on other sources of data, such as information from our identity and customer analytics data warehouses, third-party data, and network analytics data that we create by probing the Internet. When ingested into our patented machine learning system, this data is evaluated by a set of rules that compares the sources and weighs the data points that correlate, producing a single city, state, and country answer.
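To make the idea of weighing correlated sources concrete, here is a deliberately simplified Python sketch. It is not Neustar's patented system: the `Observation` type, the `resolve_location` function, the source names, the weights, and the sample locations are all hypothetical, and a plain weighted vote stands in for whatever rules the production system applies.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional, Tuple

Location = Tuple[str, str, str]  # (city, state, country)


class Observation(NamedTuple):
    """One source's opinion about where an address block is located."""
    source: str        # hypothetical source label
    location: Location
    weight: float      # hypothetical trust weight for this source


def resolve_location(observations: List[Observation]) -> Optional[Location]:
    """Return the location with the highest combined weight, or None."""
    scores = defaultdict(float)
    for obs in observations:
        # Sources that agree reinforce each other by summing their weights.
        scores[obs.location] += obs.weight
    if not scores:
        return None
    return max(scores, key=scores.get)


# Made-up sample data: the registry record disagrees with two other sources.
observations = [
    Observation("registry", ("Sterling", "VA", "US"), 0.2),
    Observation("network_probe", ("Denver", "CO", "US"), 0.5),
    Observation("third_party", ("Denver", "CO", "US"), 0.4),
]

answer = resolve_location(observations)
print(answer)
```

In this toy example the two agreeing sources outweigh the registry record, so the block resolves to the location they share rather than to the registrant's address.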
These data sources provide us with coverage of greater than 99% of all allocated IPv6 addresses. However, because the IPv6 address space is so large, there is no way that we can traverse the massive number of individual IP addresses every day, nor would we want to, given the size of the resulting geolocation file. Because it is highly unlikely that humans will ever run out of available IPv6 addresses, there appears to be much less reassignment of addresses from one device to another. This means that, given enough time, geolocation services that use network probes or other active means of determining an IP address's physical location will eventually be able to create a historical map of individual IPv6 addresses, which can be used to validate the location of a current IPv6 address in the same block. Therefore, we synthesize and publish IPv6 location information in blocks of addresses, usually as allocated by individual ISPs.
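Publishing by block rather than by address means a lookup only has to find the block that contains a given address. The sketch below illustrates that with Python's standard `ipaddress` module. The `BLOCKS` table and the `lookup` function are hypothetical, the prefixes come from the 2001:db8::/32 range reserved for documentation, and the locations are invented; it does not represent the format of any actual Neustar data file.

```python
import ipaddress
from typing import Optional, Tuple

Location = Tuple[str, str, str]  # (city, state, country)

# Hypothetical block-level table using documentation prefixes.
BLOCKS = {
    ipaddress.ip_network("2001:db8:1000::/36"): ("Denver", "CO", "US"),
    ipaddress.ip_network("2001:db8:2000::/36"): ("Austin", "TX", "US"),
}


def lookup(address: str) -> Optional[Location]:
    """Return the location of the most specific block containing address."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in BLOCKS if ip in net]
    if not matches:
        return None
    # Prefer the longest prefix when blocks are nested.
    best = max(matches, key=lambda net: net.prefixlen)
    return BLOCKS[best]


print(lookup("2001:db8:1abc::1"))
print(lookup("2001:db8:3000::1"))
```

A linear scan is fine for a two-entry table; a real data set with many blocks would more likely use a prefix tree or a sorted range search.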
The last thing that we have learned about IPv6 geolocation is that no matter how sophisticated your computer algorithms are, there are going to be mistakes. At Neustar, we handle this by having our dedicated team of Network Geography Analysts involved in every stage of our data processing, from reference data research to post-synthesis quality assurance, to input of geo-feedback from customers and partners. By having a human provide guidance to the systems and make the final decision on production data geolocation, we continue to make our systems more accurate and efficient, enabling us to provide the highest quality decisioning data to our customers.