Web Analytics
WEB ANALYTICS TECHNOLOGIES

There are two main technological approaches to collecting web analytics data. The older method, "logfile analysis", reads the logfiles in which the web server records all its transactions. The newer method, "page tagging", uses JavaScript on each page to notify a third-party server when a page is rendered by a web browser. Both methods are now widely used.

Web server logfile analysis

Web servers have always recorded all their transactions in a logfile. It was soon realised that these logfiles could be read by a program to provide data on the popularity of the website. In the early 1990s, web site statistics consisted primarily of counting the number of client requests made to the web server. This was a reasonable method initially, since each web site often consisted of a single HTML file. However, with the introduction of images in HTML, and of web sites that spanned multiple HTML files, this count became less useful.

Two units of measure were introduced in the mid 1990s to gauge more accurately the amount of human activity on web servers: "page views" and "visits" (or "sessions"). A page view was defined as a request made to the web server for a page, as opposed to a graphic, while a visit was defined as a sequence of requests from a uniquely identified client that expired after a certain amount of inactivity, usually 30 minutes. Page views and visits are still commonly displayed metrics, but are now considered rather unsophisticated measurements.

The emergence of search engine spiders and robots in the late 1990s, along with web proxies and dynamically assigned IP addresses for large companies and ISPs, made it more difficult to identify unique human visitors to a website. Log analyzers responded by tracking visits by cookies, and by ignoring requests from known spiders.

The extensive use of web caches also presented a problem for logfile analysis.
If a person revisits a page, the second request will often be served from the browser's cache, so the web server receives no request. This means that the person's path through the site is lost. Caching can be defeated by configuring the web server, but this can result in degraded performance for the visitor to the website.

Page tagging

Concerns about the accuracy of logfile analysis in the presence of caching, and the desire to be able to perform web analytics as an outsourced service, led to the second data collection method, page tagging.

In the mid 1990s, web counters were commonly seen. These were images included in a web page that showed the number of times the image had been requested, which was an estimate of the number of visits to that page. In the late 1990s this concept evolved to use a small invisible image instead of a visible one and, by using JavaScript, to pass along with the image request certain information about the page and the visitor. This information can then be processed remotely by a web analytics company, and extensive statistics generated.

The web analytics service also manages the process of assigning a cookie to the user, which can uniquely identify them during their visit and in subsequent visits. Until recently, these have usually been "third-party cookies": cookies set by the web analytics company's domain rather than by the domain being browsed. However, privacy concerns have now led a noticeable minority of users to block or delete third-party cookies, so many programs are now using first-party cookies instead.

Logfile analysis vs page tagging

Both logfile analysis programs and page tagging solutions are readily available to companies which wish to perform web analytics. In many cases, the same web analytics company will offer both approaches. The question then arises which method a company should choose.
There are advantages and disadvantages to each approach.

Advantages of logfile analysis

The main advantages of logfile analysis over page tagging are as follows.

Advantages of page tagging

The main advantages of page tagging over logfile analysis are as follows.

Economic factors

Logfile analysis is almost always performed in-house. Page tagging can be performed in-house, but it is more often provided as a third-party service. The economic difference between these two models is often the most important consideration for a company deciding which to purchase.
Which solution is cheaper often depends on the amount of technical expertise within the company, the vendor chosen, the amount of activity seen on the web sites, the depth and type of information sought, and the number of distinct web sites needing statistics.

Hybrid methods

Some companies are now producing programs which collect data through both logfiles and page tagging. By using a hybrid method, they aim to produce more accurate statistics than either method on its own.

Other methods

Other methods of data collection have been used, but are not currently widely deployed. These include integrating the web analytics program into the web server, and collecting data by sniffing the network traffic passing between the web server and the outside world.

There is also a server-side variant of page tagging. Instead of collecting the information on the user's side when the page is opened, the script can run on the server, which sends the data just before the page is delivered to the user.

WEB ANALYTICS CONCEPTS

Key definitions

Hit: A request for a file from the web server. Available only in log analysis.

Page View: In log analysis, a request for a file whose type is defined as a page. In page tagging, an occurrence of the script being run. In log analysis, a single page view may generate multiple hits, as all the resources required to view the page (images, .js and .css files) are also requested from the web server.

Visit / Session: A series of requests from the same uniquely identified client with a set timeout. A visit is expected to contain multiple hits (in log analysis) and page views.

Visitor / Unique Visitor: The uniquely identified client generating requests on the web server (log analysis) or viewing pages (page tagging). A visitor can make multiple visits.

Repeat Visitor: A visitor that has made at least one previous visit.

New Visitor: A visitor that has not made any previous visits.
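
The definitions above can be illustrated with a minimal Python sketch of how a log analyzer might derive these counts from a list of requests. The `Request` record, the `summarize` function, and the rule that paths ending in `.html`, `.htm` or `/` count as pages are all assumptions made for the example, not part of any particular product; the 30-minute timeout is the usual value mentioned earlier.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumptions for illustration: which paths count as pages, and the timeout.
PAGE_SUFFIXES = (".html", ".htm", "/")
VISIT_TIMEOUT = timedelta(minutes=30)


@dataclass
class Request:
    client_id: str   # the "uniquely identified client", e.g. a cookie value
    path: str        # the file requested from the web server
    time: datetime   # when the request was received


def summarize(requests):
    """Count hits, page views, visits, visitors and repeat visitors."""
    last_seen = {}                 # client_id -> time of that client's latest request
    visits_per_client = Counter()  # client_id -> number of visits started
    page_views = 0
    for req in sorted(requests, key=lambda r: r.time):
        # Every request is a hit; only requests for pages are page views.
        if req.path.endswith(PAGE_SUFFIXES):
            page_views += 1
        previous = last_seen.get(req.client_id)
        # A new visit starts on a client's first request or after the timeout.
        if previous is None or req.time - previous > VISIT_TIMEOUT:
            visits_per_client[req.client_id] += 1
        last_seen[req.client_id] = req.time
    return {
        "hits": len(requests),
        "page_views": page_views,
        "visits": sum(visits_per_client.values()),
        "visitors": len(visits_per_client),
        "repeat_visitors": sum(1 for n in visits_per_client.values() if n > 1),
    }
```

In this sketch a repeat visitor is simply a client with more than one visit inside the analyzed data; a real tool would also need visit history from before the reporting period.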

WEB ANALYTICS METHODS

1st party vs 3rd party cookies

This is primarily a concern with page tagging solutions provided by a third party. Many web analytics vendors used 3rd party cookies (cookies assigned by their own domain) to track visitors on their client sites. 3rd party cookies easily handle visitors who cross multiple domains within the client site, since the cookie is always handled by the vendor's servers. However, a 3rd party cookie is also trackable across multiple client sites, theoretically allowing an analytics vendor to create visitor profiles.

Most vendors of page tagging solutions have now moved to provide at least the option of using 1st party cookies (cookies assigned from the client domain). With 1st party cookies, a visitor traveling from www.examplealias.com to www.example.com will generally be treated as two different visitors, because the cookie from www.examplealias.com is not available on www.example.com.

In 2005, some reports showed that about 9-11% of Internet users blocked third-party cookies. Many analytics vendors provide a fallback method when third-party cookies are not accepted, such as using a combination of IP address and time period. Without such a fallback, these users will generally appear as unique on each page reported on for a company. If unique visitor counts are inflated, other metrics suffer in accuracy too.

Another problem is cookie deletion. When web analytics depend on cookies to identify unique visitors, the statistics depend on a persistent cookie to hold a unique visitor ID. When users delete cookies, it is usually an all-or-nothing scenario in which both first and third party cookies are removed. If this is done between interactions with the site(s), the user will appear as a first-time visitor at their next interaction point. Without a persistent and unique visitor ID, conversions, click-stream analysis, and other metrics that depend on the activities of a unique visitor over time cannot be accurate.
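
The fallback mentioned above, identifying a visitor by IP address and time period when no cookie is accepted, can be sketched roughly as follows. This is only an illustration: the function names, the 30-minute window, and the choice to group requests into fixed time buckets are assumptions, and vendors may implement the fallback quite differently.

```python
from datetime import datetime, timedelta, timezone

# Assumed length of the "time period" used by the fallback.
FALLBACK_WINDOW = timedelta(minutes=30)


def visitor_key(cookie_id, ip_address, when):
    """Return a key identifying the visitor behind one request.

    Prefer the cookie's visitor ID; without a cookie, fall back to the
    IP address combined with the fixed time bucket the request falls in.
    """
    if cookie_id:
        return ("cookie", cookie_id)
    bucket = int(when.timestamp() // FALLBACK_WINDOW.total_seconds())
    return ("ip", ip_address, bucket)


def count_unique_visitors(records):
    """Count distinct visitors in (cookie_id, ip_address, when) records."""
    return len({visitor_key(*record) for record in records})
```

Under this scheme a user who deletes the cookie between visits receives a new visitor ID and is counted again, and several cookieless users behind one shared IP address collapse into one visitor, which is why such counts are only approximate.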

Cookies are used because IP addresses are not unique to users and may be shared by large groups. Other methods of uniquely identifying a user are technically challenging and would either limit the trackable audience or be considered suspicious. Cookies are the selected option because they reach the lowest common denominator without using spyware technologies.

Unique landing pages vs referrals for campaign tracking

Tracking the amount of activity generated through advertising relationships with external web sites by means of the referrals reports available in most web analytics packages is significantly less accurate than using unique landing pages. Referring URLs are an unreliable source of information for the following reasons: