Entire Internet Technology Insights – Jan 2015

This Entire Internet update is based on 306 million resolvable domains on the Internet, including 100% coverage of .com/.net/.org. We published a similar Entire Internet insights report for Jan-June 2014, so this one includes some updates on that. This time we’re organizing it by category to show the winners and losers in each.

Top CMS Growth

CMS Increases on Entire Internet

WordPress is the number 1 CMS currently in use; its usage has grown by more than 2 million domains since June 2014. Wix, Squarespace, Weebly and various other hosted web solutions also grew by hundreds of thousands of domains. Two of the biggest losers are Yahoo Site Builder and Website Tonight.

Top Analytics Growth

Entire Internet Analytics Increase

Google Analytics dropped by a few million domains in our Jan-June 2014 update; it’s back up again this time with an increase of 2 million sites. A/B testing company Optimizely increased by a huge amount, which may be down to individual site owners placing its snippet across their entire network of sites (this information can be found in BuiltWith Pro). Facebook Domain Insights increased its numbers by more than 300,000, and Chinese hit counter 51.la added half a million new domains to its network.

Fewer domains are using premium tools across fewer premium tool providers.

Within the analytics category, more premium technologies are losing market share than gaining it, which suggests some consolidation around which premium analytics technologies are in use. More domains dropped premium analytics technologies than added them: a net deficit of around 154,000 domains removed premium analytics technologies altogether.

Top JavaScript Growth

JavaScript Usage Increase Across Entire Internet

jQuery increased its market share to 50 million domains, adding a further 2.7 million, with no end in sight for its incredible growth. Modernizr and HTML5Shiv from Paul Irish’s team added over a million new domains between them. Internet Explorer has seen a resurgence of late, with 814k domains adding the ability to pin the website to the Windows Start Bar. The biggest losers include MooTools, which hasn’t seen a new version since August 2014, and SWFObject, a library for adding Flash content to a website (Flash is not supported on Apple’s iOS devices).

eCommerce Growth

Shop Growth

WooCommerce was the only real standout eCommerce platform, showing growth in the hundreds of thousands and accounting for almost 50% of all the new eCommerce installations we found on the web. 103 eCommerce technologies gained customers whilst 121 lost customers.

Top 5 Premium Tools Control nearly three quarters of the premium eCommerce space.

The top 5 eCommerce technologies (WooCommerce, Magento, Shopify, BigCommerce and Volusion) make up 73% of the premium eCommerce technology landscape, with 152 other premium providers sharing the remaining 27%.

Font Growth

Old libraries like sIFR and Cufón are losing market share, whilst newer font delivery tools like the Google Font API and Font Awesome have added almost 6 million new domains between them.

Mobile Growth

Viewport Meta Increase in 6 months.

Every single mobile-categorized technology increased in usage. The viewport meta tag added 4.5 million domains, up from an increase of 3.7 million domains in the previous 6 months, showing that more sites are becoming mobile compatible as the proliferation of smartphones continues.

Document Format Growth

Document Standard Growth

HTML5 usage increased by 10 million domains, the largest technology increase we have recorded in the last 6 months. 1 million more domains are also using Twitter Bootstrap and just under 2 million have implemented Open Graph Protocol. 2.5 million fewer domains are using the Frameset tag and 1 million fewer sites are using the Meta Robots tag. Thankfully, 35k sites also removed the blink tag!

ZX Spectrum Resurgence

Attribution: Flickr

My first computer was a ZX Spectrum. It’s almost as old as me and has a 3.5MHz processor with 16KB of memory. There’s a ZX Spectrum HTTPD meta tag (most likely a joke) that is returned by some websites. Last year we found it on 9 sites out of just under 300 million domains; this year that number is around 64k!

Technology Consolidation

At our last update we noticed that more premium technologies were losing domains, whilst fewer premium technologies were gaining more domains.

This year 588 (54%) premium technologies lost customers whilst 510 (46%) gained. The 46% that gained added 2,075,729 domains and the 54% that are losing customers lost 727,509 domains in total. This shows a continuing technology consolidation trend from the beginning of last year, with a slight slowdown in the number of losing technologies but a large increase in the number of domains gained by the winning technologies.
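The arithmetic behind those figures can be checked with a short Python sketch. The four input numbers come from the paragraph above; the net change and the per-technology averages are derived here for illustration and are not figures BuiltWith published.

```python
# Figures quoted in the paragraph above.
LOSING_TECHNOLOGIES = 588
GAINING_TECHNOLOGIES = 510
DOMAINS_GAINED = 2_075_729
DOMAINS_LOST = 727_509


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


total = LOSING_TECHNOLOGIES + GAINING_TECHNOLOGIES
losing_pct = share(LOSING_TECHNOLOGIES, total)
gaining_pct = share(GAINING_TECHNOLOGIES, total)

# Derived values (our own illustration, not published figures).
net_domains = DOMAINS_GAINED - DOMAINS_LOST
avg_gain = DOMAINS_GAINED / GAINING_TECHNOLOGIES
avg_loss = DOMAINS_LOST / LOSING_TECHNOLOGIES

print(f"{total} premium technologies: {losing_pct}% losing, {gaining_pct}% gaining")
print(f"Net change: {net_domains:+,} domains")
print(f"Average gain per winning technology: {avg_gain:,.0f} domains")
print(f"Average loss per losing technology: {avg_loss:,.0f} domains")
```

On these numbers the winners gained on average roughly three times as many domains as the losers lost, which is the consolidation pattern the post describes.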

 

 

New Market Share Graphs

Happy New Year! We’ve just released a new version of our customer market share page. This new version uses our comprehensive historical data to give you a free view back in time to see the movements between similar technologies:

Gaining and Losing Market Share

Everyone has access to this new information, with premium and trial accounts seeing more data than logged out users. To access this data click “Customers and Market Share” when searching for technologies, or the Full Report button on technology Usage Statistics pages. Here’s an example of one of the pages.

It also shows you the international breakdown of where the customers for a technology are –

topPhysicalLocations

This breakdown is also available with a higher level of customizability and access to the actual list of sites in our Pro Product.

Happy Holiday Season from BuiltWith

Thanks everyone for a great year. BuiltWith 2015 will be even better, with new features, more analysis and more coverage!

 

BuiltWith’s Operating Hours Over the Holiday Break -

December 24th – Open 24 hours
December 25th – Open 24 hours
December 26th – Open 24 hours
December 31st – Open 24 hours
January 1st – Open 24 hours

 

Net New Reports

Every day we re-index millions of websites, meaning our technology coverage changes very frequently. Basic, Pro and Enterprise customers will now get a weekly overview email detailing Net New results in their dashboard reports.

netNewReport

This lets you know which reports contain sites we’ve never before seen using the technology, and allows you to quickly jump to those reports to research the new leads further.

Learn more about our data coverage frequency in our “Moving to Real Time” blog post.

Lead Research

Improved research for Qualified Contacts

The new lead research feature lets you qualify people’s emails as well as find their social presence in one easy click. Lead Research can be accessed by clicking qualified results on reports or via your dashboard. It is a free tool for Basic, Pro and Enterprise users.

Lead Research

The screenshot above shows the social and contact information for me. Basic, Pro and Enterprise customers can access lead research by clicking on names in report results or by typing in arbitrary data on the lead research tool that’s available here.

You can also watch a quick screencast that demonstrates the Lead Research feature of BuiltWith Pro.

This is just one of the new features launched in December. We’ve got plenty more new features coming, all driven by customer feedback. If you’d like to have something added or have an awesome idea, let us know!

Salesforce Coupling

BuiltWith Pro and Salesforce customers can now enjoy tighter synchronization between the two tools. If you have enabled bulk functionality on your account, you will now be able to quickly access accounts and leads from the BuiltWith report view:

Click direct to Salesforce Leads and Accounts from BuiltWith

Clicking on the lead/account icon next to the result will take you directly to that lead/account within your Salesforce interface. This new functionality is live for Basic, Pro and Enterprise customers using the Bulk Update Salesforce feature.

A Website in the Forest

“If a tree falls in a forest and no one is around to hear it, does it make a sound?”

Who connects to a website that you’ve told no one about?

Because we know that we visit websites frequently ourselves, we wanted to see who else was visiting newly registered websites, how quickly they were doing it, and why. We also wanted to see who knew about domains that were much harder to trace.

How different TLDs differ in public availability

All new gTLDs (.zone, .pro etc.) and some of the major TLDs (.com, .net, .org, .biz) have daily-updated lists of registered domains accessible to almost anybody. Inevitably, if you register a domain within one of these zones it will be public knowledge within a few hours of registration, regardless of how obscure the domain is.

ccTLDs (.uk, .pw, .de) operate differently: they don’t generally make the list of registered domains publicly available.

The Domain Name Setup

We registered two domains: one a .com and the other a .pw. The .pw TLD is the country code top-level domain for Palau, a small group of islands in the Pacific Ocean. It is, however, sold as an open ccTLD by “The Professional Web”, a subsidiary of Directi, an internet registrar with 1000+ employees. We chose this ccTLD because it was the cheapest to register.

Both domain names consist of 12 random letters and numbers, making it almost impossible for anyone to guess them.
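As a rough illustration of why such names are hard to guess, the Python sketch below generates a 12-character label and counts the possible combinations. The post does not say how the names were actually generated; the lowercase-letters-plus-digits alphabet and the function names here are assumptions made for the example.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits (hyphens left out for simplicity).
ALPHABET = string.ascii_lowercase + string.digits


def random_label(length: int = 12) -> str:
    """Return a random domain label drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def search_space(length: int = 12) -> int:
    """Number of distinct labels of the given length over ALPHABET."""
    return len(ALPHABET) ** length


label = random_label()
print(f"Example domain: {label}.com")
print(f"Possible 12-character labels: {search_space():.2e}")
```

With 36 possible characters per position there are 36^12 candidate names, roughly 4.7 × 10^18, so a visitor is far more likely to have learned of the domain from a leak or a zone file than by guessing.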

The Website Setup

We built a simple HTML5 valid website which contains –

  1. The first four lines of Prometheus by Lord Byron in a paragraph tag
  2. A signature for an expensive marketing automation tool – for our own testing
  3. A single PNG image
  4. A linked CSS stylesheet
  5. A linked woff font
  6. A linked JS file
  7. An invocation of a web socket
  8. An HTML5 video tag with mp4, ogg, webm and 3gp source files
  9. Inline JavaScript which makes an AJAX call

Sites that are “scraping” will generally just load the HTML content of a page, skipping the images, JavaScript, CSS and videos, and will definitely not perform any AJAX requests or open web sockets. A modern web browser would request most of the files listed above. We wanted to make sure we could tell scrapers apart from actual web browsers in the request logs.
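A minimal Python sketch of that log check is shown below. The log format, the file paths and the client addresses are hypothetical (the post does not publish its logs); the only idea taken from the text is that a client which fetches the HTML but none of the linked resources looks like a scraper.

```python
from collections import defaultdict

# Hypothetical sub-resource suffixes matching the kinds of files listed above.
SUBRESOURCE_SUFFIXES = (".png", ".css", ".woff", ".js", ".mp4", ".ogg", ".webm", ".3gp")
# Hypothetical endpoint hit by the page's inline AJAX call.
AJAX_PATH = "/ajax"


def classify_clients(requests):
    """Map each client to 'browser-like' or 'scraper-like'.

    `requests` is an iterable of (client_ip, path) pairs. A client counts as
    browser-like if it requested at least one sub-resource or the AJAX endpoint.
    """
    paths_by_client = defaultdict(set)
    for client, path in requests:
        paths_by_client[client].add(path)

    result = {}
    for client, paths in paths_by_client.items():
        fetched_assets = any(
            path.endswith(SUBRESOURCE_SUFFIXES) or path == AJAX_PATH
            for path in paths
        )
        result[client] = "browser-like" if fetched_assets else "scraper-like"
    return result


# Made-up sample log using documentation-only IP addresses.
SAMPLE_LOG = [
    ("192.0.2.10", "/"),
    ("198.51.100.7", "/"),
    ("198.51.100.7", "/style.css"),
    ("198.51.100.7", "/script.js"),
    ("198.51.100.7", "/ajax"),
]

for client, kind in classify_clients(SAMPLE_LOG).items():
    print(client, kind)
```

Note that the user agent is deliberately ignored here: a bot can claim to be any browser, but it cannot hide the fact that it never asked for the stylesheet, script or media files.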

The Leaks

Registering a domain in absolute secrecy is almost impossible. Here are the places where information leaked before we had really even got started:

  1. Namecheap – by registering the domain here, this company is aware of our domain.
  2. Google – the confirmation of registration was delivered to us by Gmail. Google scans the content of emails to target advertisements, and it potentially also visits website links in the email to determine whether the website has malicious content.
  3. All DNS server owners between Telstra and Namecheap – we do a test lookup using wget on Ubuntu Linux from our Telstra broadband plan in Australia to ensure the web server is set up correctly. This performs a DNS lookup on our ISP’s servers, which look for a recently cached copy of the IP address; if no other lookups have been done, the answer has to come from Namecheap’s DNS system.
  4. Directi and Verisign – the registry owners for the PW and COM TLDs respectively, each of which has a record of the registration of its domain.

Who visited us – Tin Foil Hat Required

We expected to get traffic to our .com registration within a few hours because of its visibility in publicly available zone files; it was just a matter of who and when. All queries within the first 28 hours were bots pretending to be PC browsers, with the exception of one. Here’s what happened:

.COM Domain Registered and IP points to web server @ 4:20 UTC on 4th November 2014.

+4 minutes – Telstra (our ISP)
This is us testing to ensure that the server is set up. This also leaks the existence of the domain into the DNS resolution system.

+3 hours 57 minutes – Peer 1 – Unknown
A connection from a Peer 1 machine hosted in Vancouver. Blacklist logs for this bot show it changes its user agent frequently.

+6 hours 45 minutes – Prolocation – Unknown
A connection from a machine in the Netherlands that resolves to noc.prolocation.net, so it appears to be part of this hosting company’s business.

+7 hours 17 minutes – Cyveillance – QinetiQ Company
This US-based company provides threat intelligence and internet monitoring. Its parent company, QinetiQ, is a UK company formed from the privatization of parts of the Ministry of Defence’s Defence Evaluation and Research Agency.

+11 hours 32 minutes – Hurricane Electric – Unknown
This is a strange one. The IP of this visit resolves to a single .com domain name. The WHOIS record for it shows it’s owned by the managing director of a US aviation and insurance company, but it uses a Chinese DNS provider. Searching for registrations from the same person turned up a few other domain names, all of which were registered around April 2014 and all of which contained Chinese content. Given how easy it is to find this person’s contact information (home address, phone number, wife’s name) via simple Google searches, we don’t believe that the listed owner of these DNS records is the person controlling whatever is connecting to the website.

+12 hours 2 minutes – Cyren (Commtouch)
Cyren is a cloud-based security solution provider. It lists Google as one of its customers, saying that it provides them with an “embedded antivirus solution to protect customers from malware”. As noted in the “Leaks” section above, we received the confirmation of the domain registration via Gmail, which may have caused this connection to the website.

+14 hours 5 minutes – DomainTools
DomainTools provides DNS research, WHOIS lookup and cybercrime investigation services.

+18 hours 37 minutes – BuiltWith
That’s us! Our window for finding new sites should be no more than 24 hours.

+26 hours 47 minutes – Prescient Software Inc / IRS
An IP with a DNS record of phishmongers.com that has TXT records pointing to the IRS that seems to be linked to Prescient Software Inc – you can Google that one for conspiracy theories.
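To see when each of these visits happened in absolute terms, the offsets above can be added to the registration time with Python’s datetime module. This is a small sketch that assumes every offset is measured from the 4:20 UTC registration time quoted at the top of the timeline; the variable and function names are ours.

```python
from datetime import datetime, timedelta, timezone

# Registration time from the post: 4:20 UTC on 4th November 2014.
REGISTERED = datetime(2014, 11, 4, 4, 20, tzinfo=timezone.utc)

# (visitor, hours, minutes) offsets as listed in the timeline above.
VISITS = [
    ("Telstra (our own test)", 0, 4),
    ("Peer 1", 3, 57),
    ("Prolocation", 6, 45),
    ("Cyveillance", 7, 17),
    ("Hurricane Electric", 11, 32),
    ("Cyren (Commtouch)", 12, 2),
    ("DomainTools", 14, 5),
    ("BuiltWith", 18, 37),
    ("Prescient Software Inc / IRS", 26, 47),
]


def visit_times(start, visits):
    """Convert (name, hours, minutes) offsets into absolute UTC datetimes."""
    return [
        (name, start + timedelta(hours=hours, minutes=minutes))
        for name, hours, minutes in visits
    ]


TIMELINE = visit_times(REGISTERED, VISITS)

for name, when in TIMELINE:
    print(f"{when:%Y-%m-%d %H:%M} UTC  {name}")
```

Under that assumption, the last visit in the list lands at 07:07 UTC on 5th November, just under 27 hours after registration.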

What about the .PW Domain? Tin Foil Hat not Required

24 hours after registering the .PW domain in exactly the same manner as the .COM domain, not a single bot has visited the site (except for our own test to ensure the domain is set up correctly).

Conclusion

In just under 27 hours from registering a .com domain, 8 different entities visited the website. Half of them (Cyren, Cyveillance, DomainTools and BuiltWith) are known companies that advertise the fact they do this; the other half are either unknown or don’t advertise why they are indexing websites so soon after registration. All of the bots except the Prescient Software one pretended to be web browsers, but none of them actually were (no CSS, JS or media files were requested).

24 hours after registering a .PW domain in the same manner as the .COM, there have been no visits from any bots at all. That isn’t what we expected, given that the data leaks that happen when you register a domain provide some companies with information about that domain.

Missing from our logs are search engine bots. This suggests that search engines are not using domain registrations as a source of new sites to crawl, or at least not quickly.

We will continue to monitor the visits to the websites and see what happens in the days and weeks to come.

Google Universal Analytics in eCommerce

Web analytics consulting company Tatvic recently used BuiltWith data to do a comprehensive study of eCommerce sites using Google’s Universal Analytics tag. We’ve been tracking Google Universal Analytics since January 2014 and have seen its popularity increase since then as it became the de facto method for adding Google Analytics to a website.

Internet Retailer published the findings in a blog post, “Are online retailers embracing Google’s new Universal Analytics?”, which found that the majority of eCommerce stores had implemented UA at some stage in 2014.

One of the interesting things that came out of the migration rate report is the speed at which hosted solutions had migrated their customers, potentially automatically, to UA as shown below –

Adoption of UA by Platform

You can download the full Whitepaper from Tatvic for more information.

 

One Year of eCommerce Data Analysis

We’ve just completed one full year of eCommerce Sales Trends data –

200,000 web stores eCommerce Data

Sales data from hundreds of thousands of web stores 

Some interesting things to note about global online shopping trends –

  • Online shopping ramps up in late October
    Most online purchases happen in October, November and December, with a very sharp drop-off in January. At this time of the year it is customary in many cultures to exchange gifts, and November sees a sharp rise thanks to “Black Friday/Cyber Monday” sales.
  • January Bounce Back
    Many retail stores have “January Sales” with repeated discounting, which shows up as a bounce back to increased sales from mid-January to February.
  • April-September Quiet Period
    Online retail sales don’t really increase or decrease much in this period.
  • UK and Australia have similar shopping patterns
    Between June and August 2014, UK and AU based stores had the same peaks. This is interesting from a seasonal perspective, as their seasons are not aligned (UK summer = Australian winter).

We’re going to monitor how much shopping trends differ from year to year going forward as well as provide new data on the increase or decrease in retail sales based on historical trends, available at builtwith.com/ecommerce soon!