A Website in the Forest

“If a tree falls in a forest and no one is around to hear it, does it make a sound?”

Who connects to a website that you’ve told no one about?

Because we visit websites frequently ourselves, we wanted to see who else visits newly registered websites, how quickly they do it, and why. We also wanted to see who knew about domains that are much harder to trace.

How different TLDs differ in public availability

All new gTLDs (.zone, .pro, etc.) and some of the major TLDs (.com, .net, .org, .biz) publish lists of registered domains that are updated daily and accessible to almost anybody. If you register a domain in one of these zones, it will inevitably be public knowledge within a few hours of registration, regardless of how obscure the domain is.

ccTLDs (.uk, .pw, .de) operate differently: they don’t generally make the list of registered domains publicly available.

The Domain Name Setup

We registered two domains, one a .com and the other a .pw. .pw is the country code top-level domain for Palau, a small group of islands in the Pacific Ocean. The ccTLD is, however, sold as an open ccTLD by “The Professional Web”, a subsidiary of Directi, an internet registrar with 1000+ employees. We chose this ccTLD because it was the cheapest to register.

Both domain names consist of 12 random letters and numbers, making it almost impossible for anyone to guess them.
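As a rough illustration of why such names are hard to guess, here is a minimal Python sketch that generates a 12-character label and counts the possible combinations. It assumes lowercase letters and digits only; the function names are ours for illustration and this is not necessarily how the original domains were generated.

```python
import secrets
import string

# Assumption: labels use lowercase ASCII letters and digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def random_label(length: int = 12) -> str:
    """Return a random domain label built from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def search_space(length: int = 12) -> int:
    """Number of distinct labels of the given length."""
    return len(ALPHABET) ** length


if __name__ == "__main__":
    print(random_label() + ".com")
    print(f"{search_space():,} possible 12-character labels")
```

With 36 symbols and 12 positions there are 36^12 possible labels, roughly 4.7 × 10^18, so guessing one at random is not practical.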

The Website Setup

We built a simple, valid HTML5 website which contains –

  1. The first four lines of Prometheus by Lord Byron in a paragraph tag
  2. A signature for an expensive marketing automation tool – for our own testing
  3. A single PNG image
  4. A linked CSS stylesheet
  5. A linked woff font
  6. A linked JS file
  7. An invocation of a web socket
  8. An HTML5 video tag with mp4, ogg, webm and 3gp source files
  9. Inline JavaScript which makes an AJAX call

Sites that are “scraping” will generally just load the HTML content of a page. They skip the images, JavaScript, CSS and videos, and definitely don’t perform any AJAX requests or open websockets. A modern web browser would request most of the files linked above. We wanted to make sure we could tell scrapers and actual web browsers apart in the request logs.
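The distinction described above can be sketched in a few lines of Python. This is a simplified illustration, not the tooling we actually used: it assumes the log has already been reduced to (client IP, requested path) pairs, and the file names and IP addresses in the example are placeholders.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Extensions of the linked resources a real browser would normally fetch.
ASSET_EXTENSIONS = {".png", ".css", ".woff", ".js", ".mp4", ".ogg", ".webm", ".3gp"}


def classify_clients(requests):
    """Label each client by whether it ever fetched a linked asset.

    requests: iterable of (client_ip, path) pairs (a hypothetical,
    pre-parsed form of a web server access log).
    """
    fetched_assets = defaultdict(bool)
    for ip, path in requests:
        # Ignore any query string before looking at the file extension.
        suffix = PurePosixPath(path.split("?", 1)[0]).suffix.lower()
        fetched_assets[ip] |= suffix in ASSET_EXTENSIONS
    return {
        ip: "browser-like" if assets else "scraper-like"
        for ip, assets in fetched_assets.items()
    }


if __name__ == "__main__":
    sample = [
        ("192.0.2.10", "/"),            # HTML only
        ("198.51.100.7", "/"),
        ("198.51.100.7", "/style.css"),
        ("198.51.100.7", "/image.png"),
    ]
    print(classify_clients(sample))
```

A client that only ever requests the HTML page is labelled scraper-like, whatever its user agent claims to be.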

The Leaks

Registering a domain in absolute secrecy is almost impossible. Here are the places the domain leaked before we had really even got started –

  1. Namecheap – by registering the domain here, this company is aware of our domain
  2. Google – the confirmation of registration was delivered to us by Gmail. Google scans the content of emails to choose advertisements, and potentially visits website links in the email to determine whether the website has malicious content.
  3. All DNS server owners between Telstra and Namecheap – we do a test lookup using wget on Ubuntu Linux from our broadband plan in Australia with Telstra to ensure the web server is set up correctly. This performs a DNS lookup on our ISP’s servers, which look for a recently cached copy of the IP address and, if no other lookups have been done, fetch it from Namecheap’s DNS system.
  4. Directi and Verisign – the registry owners for the PW and COM TLDs respectively – each have a record of the domain registered in their TLD.

Who visited us – Tin Foil Hat Required

We expected to get traffic to our .com registration within a few hours because of its visibility in publicly available zone files; it was just a matter of who and when. All queries within the first 28 hours were from bots, and all but one of them pretended to be PC browsers. Here’s what happened –

.COM Domain Registered and IP points to web server @ 4:20 UTC on 4th November 2014.

+4 minutes – Telstra (our ISP)
This is us testing to ensure that the server is set up. This also leaks the existence of the domain into the DNS resolution system.

+3 hours 57 minutes – Peer 1 – Unknown
A connection from a Peer 1 machine hosted in Vancouver. Blacklist logs for this bot show it changes its user agent frequently.

+6 hours 45 minutes – Prolocation – Unknown
A connection from a machine in the Netherlands that resolves to noc.prolocation.net, so it appears to be part of this hosting company’s business.

+7 hours 17 minutes – Cyveillance – QinetiQ Company
This US-based company provides threat intelligence and internet monitoring. Its parent company, QinetiQ, is a UK company formed from the privatization of parts of the Ministry of Defence’s Defence Evaluation and Research Agency.

+11 hours 32 minutes – Hurricane Electric  – Unknown
This is a strange one. The IP of this visit resolves to a single .com domain name. The WHOIS record for it shows it’s owned by the managing director of a US aviation and insurance company, but it uses a Chinese DNS provider. Searching for registrations from the same person turned up a few other domain names, all of which were registered around April 2014 and all of which contained Chinese content. Given how easy it was to find the person’s contact information (home address, phone number, wife’s name) via simple Google searches, we don’t believe that the listed owner of these DNS records is the person controlling whatever connected to the website.

+12 hours 2 minutes – Cyren (Commtouch)
Cyren is a cloud-based security solution provider. It lists Google as one of its customers, saying that it provides them with an “embedded antivirus solution to protect customers from malware”. As noted in the “Leaks” section above, we received the confirmation of the domain registration via Gmail, which may have caused this connection to the website.

+14 hours 5 minutes – DomainTools
DomainTools provides DNS research, WHOIS lookup and cybercrime investigation services.

+18 hours 37 minutes – BuiltWith
That’s us! Our window for finding new sites should be no more than 24 hours.

+26 hours 47 minutes – Prescient Software Inc / IRS
An IP with a DNS record of phishmongers.com that has TXT records pointing to the IRS that seems to be linked to Prescient Software Inc – you can Google that one for conspiracy theories.
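
For anyone who wants to line these visits up against their own logs, the offsets above can be converted to absolute UTC times. The sketch below uses only the registration time and offsets listed in this post; the variable and function names are ours.

```python
from datetime import datetime, timedelta, timezone

# Registration time stated above: 4:20 UTC on 4th November 2014.
REGISTERED = datetime(2014, 11, 4, 4, 20, tzinfo=timezone.utc)

# (visitor, hours, minutes) offsets copied from the timeline above.
VISITS = [
    ("Telstra (our ISP)", 0, 4),
    ("Peer 1", 3, 57),
    ("Prolocation", 6, 45),
    ("Cyveillance", 7, 17),
    ("Hurricane Electric", 11, 32),
    ("Cyren (Commtouch)", 12, 2),
    ("DomainTools", 14, 5),
    ("BuiltWith", 18, 37),
    ("Prescient Software Inc / IRS", 26, 47),
]


def visit_times(registered=REGISTERED, visits=VISITS):
    """Return (visitor, absolute UTC datetime) pairs."""
    return [
        (name, registered + timedelta(hours=hours, minutes=minutes))
        for name, hours, minutes in visits
    ]


if __name__ == "__main__":
    for name, when in visit_times():
        print(f"{when:%Y-%m-%d %H:%M} UTC  {name}")
```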

What about the .PW Domain? Tin Foil Hat not Required

24 hours after registering the .PW domain in exactly the same manner as the .COM domain, not a single bot has visited the site (except for our own test to ensure the domain is set up correctly).

Conclusion

In just under 27 hours from registering a .com domain, 8 different entities visited the website. Half of them (Cyren, Cyveillance, DomainTools and BuiltWith) are known companies that advertise the fact they do this; the other half are either unknown or don’t advertise why they index websites so soon after registration. All of the bots except the Prescient Software one pretended to be web browsers, but none of them actually were (no CSS, JS or media files were requested).

24 hours after registering a .PW domain in the same manner as the .COM, there have been no visits from any bots at all. This isn’t what we expected, given that the leaks that happen when you register a domain provide information about that domain to some companies.

Missing from our logs are search engine bots. This suggests that search engines are not using domain registrations as a source of new sites to crawl, at least not quickly.

We will continue to monitor the visits to the websites and see what happens in the days and weeks to come.

Google Universal Analytics in eCommerce

Web Analytics consulting company Tatvic recently used BuiltWith data to do a comprehensive study of eCommerce sites using Google’s Universal Analytics tag. We’ve been tracking Google Universal Analytics since January 2014 and have seen its popularity increase since then as it became the de facto method for adding Google Analytics to a website.

Internet Retailer published the findings in their blog post “Are online retailers embracing Google’s new Universal Analytics?”, finding that the majority of eCommerce stores had implemented UA at some stage in 2014.

One of the interesting things that came out of the migration rate report is the speed at which hosted solutions had migrated their customers, potentially automatically, to UA as shown below –

Adoption of UA by Platform

You can download the full Whitepaper from Tatvic for more information.

 

One Year of eCommerce Data Analysis

We’ve just completed one full year of eCommerce Sales Trends data –

200,000 web stores eCommerce Data

Sales data from hundreds of thousands of web stores 

Some interesting things to note about global online shopping trends –

  • Online shopping ramps up in late October
    Most online purchases happen in October/November/December, with a very sharp drop-off in January. At this time of the year it is customary in many cultures to exchange gifts, and November sees a sharp rise thanks to “Black Friday/Cyber Monday” sales.
  • January Bounce Back
    Many retail stores hold discounted “January Sales”, which shows up as a bounce back to increased sales from mid-January to February.
  • April-September Quiet Period
    Online retail sales don’t really increase or decrease much in this period.
  • UK and Australia have similar shopping patterns
    Between June and August 2014 both UK and AU based stores had the same peaks. This is interesting from a seasonal perspective, as their seasons are not aligned (UK Summer = Australian Winter).

We’re going to monitor how much shopping trends differ from year to year going forward as well as provide new data on the increase or decrease in retail sales based on historical trends, available at builtwith.com/ecommerce soon!

BuiltWith Responsive Profiles

BuiltWith free profiles and homepage are now responsive! BuiltWith uses Bootstrap, so we added the Responsive CSS, tweaked a few things and voila: BuiltWith now works on your mobile and looks much better.

We’re adding support for responsive across BuiltWith Trends next!

Web Leads

BuiltWith Web Leads lets you get domains from customer data entered onto signup forms, contact us pages and other user generated forms on your website into reports within BuiltWith.

An example scenario – A person from a high profile eCommerce website comes to your business, they enter their email address on the contact page but then abandon the process. BuiltWith will have picked the domain of the email address out of the form and inserted it into the Web Leads report on BuiltWith. Using the Technology Market Share tab and Prioritize Report features of BuiltWith Pro you can easily pick out the high value signups/abandonments on your website and highlight them as high priority lead sources.
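
The core step in that scenario, picking the domain out of an email address typed into a form, can be sketched as follows. This is a hypothetical illustration of the idea only, not the actual Web Leads implementation; the function name and the validation rules are our assumptions.

```python
from typing import Optional


def domain_from_email(address: str) -> Optional[str]:
    """Return the lowercased domain part of an email address, or None.

    Deliberately simple: it splits on the last "@" and requires a dot
    in the domain, which is enough for a sketch but not full validation.
    """
    local, sep, domain = address.strip().rpartition("@")
    domain = domain.lower().rstrip(".")
    if not sep or not local or "." not in domain:
        return None
    return domain


if __name__ == "__main__":
    print(domain_from_email("jane.doe@example-store.com"))
    print(domain_from_email("not-an-email"))
```

The extracted domain is what would then be looked up and added to a report, so a visitor who abandons the form after typing an address can still be matched to a company website.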

For more information on Web Leads and how to integrate them into your website visit the Web Leads Setup Knowledge Base Article.

 

WooCommerce Version Distribution

Along with our new Magento Version Distribution stats, we have also added WooCommerce version distribution detection.

The great thing about this detection is that, over the long term, you will be able to see how quickly websites upgrade to newer versions of software. WordPress, for example, has streamlined its upgrade procedure, in some cases upgrading without interruption or manual intervention. You can see this in the speed of uptake for WordPress 4.0.

Version distribution for WooCommerce on Top Sites in October 2014

The most prevalent version of WooCommerce is 2.1, which was released 8 months ago, but version 2.2 is quickly catching up. The trends for these technologies will now let us see how fast new versions of WooCommerce are adopted. See all versions of WooCommerce we now track.

Pro users will be able to flag sites that have either not upgraded or upgraded to the newer versions shortly after their release using this new data.
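
To make the idea of a version distribution concrete, here is a minimal Python sketch that groups detected version strings by major.minor release and computes each group’s share. The input format and function names are assumptions for illustration, and the version strings in the example are made up rather than taken from our data.

```python
from collections import Counter


def minor_version(version: str) -> str:
    """Reduce a version string like "2.1.12" to its major.minor part, "2.1"."""
    return ".".join(version.split(".")[:2])


def version_distribution(detected):
    """Return {major.minor: share of sites}, most common first.

    detected: iterable of version strings, one per site (hypothetical input).
    """
    counts = Counter(minor_version(v) for v in detected)
    total = sum(counts.values())
    return {version: count / total for version, count in counts.most_common()}


if __name__ == "__main__":
    # Made-up sample purely to show the shape of the output.
    sample = ["2.1.12", "2.1.9", "2.2.4", "2.0.20"]
    for version, share in version_distribution(sample).items():
        print(f"{version}: {share:.0%}")
```

Running the same calculation on each week’s detections gives the trend line that shows how quickly a new release overtakes the previous one.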

Starred Domains

You can now “star” domains in your reports. Starring a domain creates a new report on your dashboard that lets you do analysis on just your starred domains.

The stars will follow you around based on the reports you create.

A few ideas for how you could use them –

  • Highlight your own accounts, making them easier to spot in reports you create.
  • Star results you consider to be worth following up, then use the report that is generated from your starred results to find new lead sources by understanding the most prominent technology market share partners.

Learn how to use Starred Domains over on the knowledge base or dive in now on BuiltWith Pro.

Using BuiltWith for Technology Research and Advisory

With thousands of SaaS applications, web frameworks and technologies available, understanding which you should adopt can be a challenging task. Implementations are costly and time consuming, so deploying the right product is important. Feature comparison won’t cut it when making critical IT decisions.

BuiltWith is a powerful tool for assisting with technology research.

Market Share Validation

All 7,000+ technologies tracked are identified by high level and detailed categories. We report on the adoption of technologies within each category segmented by low, medium and high traffic websites. Across both of these you’ll find market share analysis in our freely available technology trends data.

BuiltWith Pro goes a step further providing the raw data used to create the trends graphs. This data allows you to select your own collections of technologies upon which to create market share comparisons and perform detailed analysis.

Vendor supplied client list and testimonial validation

Using the free technology lookup tool on our homepage or browser extension you can validate the presence of any Internet facing technology referenced in a vendor supplied list or testimonial.

Knowing which companies are using a technology combined with identification of your contacts within the organisation (via our LinkedIn Integration) opens up unique opportunities for unsolicited customer experience references.

Comprehensive Customer Lists

Vendors will rarely provide full customer lists and the references that are provided may not be representative of their average customers. BuiltWith Pro enables you to see a full list of the companies using any technology that we track.

Our advanced analysis tools enable you to quickly identify key customer groups within these lists. Segment by location, traffic levels, duration of technology use and spend.

Pulse check

Traction and a growing user base are critical to ongoing development and improvement of all applications. Active development is seldom sustained in the absence of a growing user base. Our technology trends reports provide the raw insights into product and company performance based on actual installations and usage.

Big Data

Crawling the web on a daily basis, we collect vast amounts of data which we make available to you. BuiltWith Pro provides you with both advanced online report building tools and raw exports of the data for your own onsite post processing.

Not only does this allow you to test your own hypotheses to find insights and opportunities in the data, but on a commercial basis you can also utilise our data scientists to assist with more complex data compilations.

By knowing every Internet facing technology in every website you can gain an unparalleled understanding of:

  • Interaction between complementary technologies.
  • Early visibility of emerging technologies.
  • Adoption by industry, vertical, geography and time.
  • Competitive landscape, seen through movements of customers between competing technologies.

 

Independent – Empirical data

Importantly, the data collection is carried out by us, making it impartial and independent of influence from the technology vendors themselves. The data is free of subjective interpretation and hence non-competitive with advisory services.

BuiltWith is regularly referenced as an authoritative source in the media, online and in conference presentations worldwide.

Like to know more? Sign up for a free account or contact us.

Salesforce Bulk Technology Update

You can now use BuiltWith to update all of your Accounts and/or Leads in Salesforce with relevant technology information.

Some advantages of this are –

  • Every Account/Lead has up to date Technology data attached to it
  • It automatically updates all Leads and Accounts on a daily basis with the latest data we have
  • You can create reports from within Salesforce on Leads and Accounts to see which results are using or have stopped using particular technologies, as well as see recently updated technology information. View a demo of this.

How do I Integrate this?

  1. Enable Salesforce Integration with BuiltWith via the Connected App Setup Guide
  2. Create relevant bulk custom objects within Salesforce
  3. Define your Technologies of Interest on BuiltWith and Enable Bulk Updates

What happens if I cancel my BuiltWith Account?

We will stop updating the records but they will remain in Salesforce until you decide to delete them.

Are there any limits?

There are limits on the number of Technologies of Interest: on Basic the limit is 10, on Pro it is 50, and on Enterprise there is no limit.

There are also limits on the number of results we can update in Salesforce; however, this is a Salesforce technical limitation. Please contact us if you have more than a quarter of a million Leads or Accounts you’d like to update.

 

Magento Version Distribution

We have just added version tracking for the popular eCommerce platform Magento to our data coverage and trending info. This lets us get an idea of which versions of Magento are the most used at any given point in time. Below is the breakdown of versions within top sites. As of September 2014, Magento version 1.7 is the most popular version in use.

Magento Version Distribution - September 2014

Going forward you’ll be able to track all versions of Magento and their coverage on a weekly basis, just search for Magento and the version number to find the specific trends report.

You will also be able to download reports on which version of Magento websites are using, letting you find customers that might need help upgrading or get in contact with ones that are working at the cutting edge of eCommerce software.