Study: How much do URLs matter when cross-referencing the Internet Retailer® Top 500 Guide® with Jigsaw?

AUTHOR: David Eisaiah Engel
DATE: July 14, 2010

Some marketers can benefit from cross-referencing lists like the Internet Retailer® Top 500 Guide® or the Fortune 1000 with Jigsaw® to get contact information for employees at those companies. We tested how including a URL field along with a Company Name field affected the Jigsaw® List Builder’s ability to match Internet Retailer® Top 500 Guide® companies against Jigsaw’s database.

Methods

For our experiment, we created a .csv file and populated each row with a company name from the page titled “2010 List of Internet Retailer Top 500 Retail Websites.” In an adjacent column (or “field”), we added each company’s URL.*

We created three separate .csv files from this Internet Retailer® data.

  • File 1 contained only the company name, in a field called Company Name.
  • File 2 contained the company name in a field called Company Name and the company’s URL in a field called URL.
  • File 3 contained only the company’s URL, in a field called URL.
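
The three files can be generated from a single list of name and URL pairs. What follows is a minimal Python sketch, not the procedure we actually used. The function name, the file-name prefix, and the sample row in the usage note are hypothetical. Only the field names Company Name and URL come from the files described above.

```python
import csv

def write_test_files(companies, prefix="file"):
    """Write the three test files from (company_name, url) pairs.

    Returns the list of file paths written (hypothetical naming scheme).
    """
    layouts = [
        ("Company Name",),         # File 1: company name only
        ("Company Name", "URL"),   # File 2: company name and URL
        ("URL",),                  # File 3: URL only
    ]
    paths = []
    for number, header in enumerate(layouts, start=1):
        path = f"{prefix}{number}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            for name, url in companies:
                record = {"Company Name": name, "URL": url}
                # Keep only the fields this file layout calls for.
                writer.writerow([record[column] for column in header])
        paths.append(path)
    return paths
```

Calling write_test_files([("Example Retailer", "http://www.example.com")]) would write file1.csv, file2.csv, and file3.csv in the current directory, each with a header row followed by one data row.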

We navigated to the Jigsaw® List Builder by selecting Contacts > Build a List in Jigsaw.

After uploading each file in the Jigsaw® List Builder, we filtered the results by each level of accuracy and logged the number of matches at each level.

Figure: The area in the Jigsaw® List Builder to upload a list of company names


After recording the results for each uploaded file, we clicked the gray ‘Reset’ button at the bottom of the Jigsaw® List Builder. This cleared the previous .csv file from the List Builder, allowing us to upload the next file.

Results

File 1: Company Name only for 500 companies

  • 149 matches above 90% accuracy
  • 161 matches above 80% accuracy
  • 175 matches above 70% accuracy
  • 189 matches above 60% accuracy
  • 211 matches above 50% accuracy
  • 232 matches above 40% accuracy
  • 257 matches above 30% accuracy
  • 318 matches above 20% accuracy
  • 456 matches above 10% accuracy

File 2: Company Name & URL for 500 companies

  • 486 matches above 90% accuracy
  • 486 matches above 80% accuracy
  • 486 matches above 70% accuracy
  • 486 matches above 60% accuracy
  • 486 matches above 50% accuracy
  • 486 matches above 40% accuracy
  • 486 matches above 30% accuracy
  • 486 matches above 20% accuracy
  • 486 matches above 10% accuracy

File 3: URL only for 500 companies

  • The Jigsaw® List Builder did not allow us to upload a list containing only URLs.
  • It required a Company Name field.

Interpretation

The results show that the number of matches in the highest-accuracy category (above 90%) increased by 226.17%, from 149 to 486, when cross-referencing the Internet Retailer® Top 500 Guide® by Company Name and URL rather than by Company Name alone. Put another way, we were able to find 337 more companies above the 90% accuracy level by including a URL field along with the Company Name field in the .csv list of company names.
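
The arithmetic behind these figures can be checked with a few lines of Python. This is only an illustrative sketch. The function and variable names are ours, and the two counts are the "above 90% accuracy" results reported for File 1 and File 2.

```python
def percent_increase(before, after):
    """Relative change from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100

name_only = 149      # File 1: matches above 90% accuracy
name_and_url = 486   # File 2: matches above 90% accuracy

extra_matches = name_and_url - name_only
print(extra_matches)                                         # 337
print(round(percent_increase(name_only, name_and_url), 2))   # 226.17
```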

The fact that File 2 returned the same 486 matches (out of 500 companies) at every accuracy level, from 10% up to 90%, indicates that URL is a precise way to find companies in the Jigsaw® Database.
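
To compare the two files threshold by threshold, the logged counts can be expressed as a share of the 500 uploaded companies. The sketch below assumes that each match corresponds to exactly one uploaded company, which the List Builder output did not explicitly confirm. The helper name share is hypothetical.

```python
TOTAL_COMPANIES = 500

# Accuracy thresholds (percent) and the match counts logged in Results.
thresholds = [90, 80, 70, 60, 50, 40, 30, 20, 10]
file1_matches = [149, 161, 175, 189, 211, 232, 257, 318, 456]  # Company Name only
file2_matches = [486] * len(thresholds)                        # Company Name & URL

def share(matches, total=TOTAL_COMPANIES):
    """Matches as a percentage of the uploaded list, rounded to one decimal."""
    return round(matches / total * 100, 1)

for threshold, name_only, name_and_url in zip(thresholds, file1_matches, file2_matches):
    print(f"above {threshold}%: name only {share(name_only)}%, "
          f"name and URL {share(name_and_url)}%")
```

Under that assumption, File 2 matched 97.2% of the list at every threshold, while File 1 matched 29.8% of the list above the 90% threshold.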

Further Thinking

It is common for companies to run multiple websites with the same executive team. For example, Chelsea & Scott Ltd operates http://www.onestepahead.com and http://www.leapsandbounds.com. While these may be legally registered as subsidiary companies, Jigsaw® should store them as the same company because it’s likely that the same executive team operates both sites.

To enhance Jigsaw’s ability to match companies by domain, Jigsaw® should consider giving users more incentive to add ‘Alternate Domains’ to a company’s Family Tree. As of today, a user gets only five (5) points whether they add one (1) alternate domain or 100 alternate domains.

Jigsaw® should also give users the option of exporting a company’s alternate domains when exporting lists of Companies and Contacts. One way to do this would be to add a field called ‘Alternate Domains’ to exported lists, containing a semicolon-separated list of all the ‘Alternate Domains’ listed in the company’s Family Tree.
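
As a rough illustration of what such an export could look like, the Python sketch below writes a semicolon-separated ‘Alternate Domains’ field into a .csv export. This is not Jigsaw’s export format or API. The function name, the dictionary keys, and the sample company are all hypothetical.

```python
import csv
import io

def export_with_alternate_domains(companies):
    """Return CSV text with a semicolon-separated 'Alternate Domains' field.

    `companies` is an iterable of dicts with the hypothetical keys
    'name', 'url', and 'alternate_domains' (a list of strings).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Company Name", "URL", "Alternate Domains"])
    for company in companies:
        writer.writerow([
            company["name"],
            company["url"],
            # Semicolons keep all alternate domains inside a single CSV field.
            ";".join(company.get("alternate_domains", [])),
        ])
    return buffer.getvalue()

# Made-up sample data using reserved example domains.
sample = [{
    "name": "Example Co",
    "url": "example.com",
    "alternate_domains": ["example.net", "example.org"],
}]
print(export_with_alternate_domains(sample))
```

Because the separator inside the field is a semicolon rather than a comma, the field does not need quoting, and a spreadsheet user can split it back into individual domains.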

How could Jigsaw® users benefit from an exported list of ‘Alternate Domains’?


* URLs were added by searching the web for each company’s name. They were not taken from the Internet Retailer® Top 500 Guide®. At the time of extraction, there were no terms and conditions prohibiting us from copying the company names on this page: <http://www.verticalwebmedia.com/top500/list.asp>.

Jigsaw® is a registered trademark of Jigsaw Data Corporation. Internet Retailer® is a registered trademark of Vertical Web Media. The Fortune 1000 is published by Time Inc’s Fortune | Money Group. Neither the author nor this publication is affiliated with Vertical Web Media, Jigsaw Data Corporation, Time Inc’s Fortune | Money Group, or Chelsea & Scott, Ltd.

Creative Commons License

Study: How much do URLs matter when cross-referencing the Internet Retailer® Top 500 Guide® with Jigsaw? by David Eisaiah Engel is licensed under a Creative Commons Attribution 3.0 Unported License.