
Data

Athena Quarterly Reports: Athena metrics will be included in IS&T quarterly reports as of FY10Q2. The version in the quarterly reports is abridged due to space considerations; the unabridged reports are posted in this wiki.

Statistical data on Athena use: Raw data is available as far back as 1992 in some cases. Both historical and recent data are posted on this page, which is usually updated each semester.

Statistics on printing

Notes

A project is underway to improve our current data sources and create a single clearinghouse for data and statistics. See Athena Metrics Update Project for more information.

- - - - - - - - DRAFT - - - - - - -

Problem Statement

There is no single repository for data related to the Athena Computing Environment. We have been collecting usage data for quite some time, but it has not found its way to the appropriate audience. There is also a desire to know not just how many users are using Athena, but what they are using it for. While we can produce numerous anecdotes, the plural of anecdote is not data. Any data collected needs to be presented in an easily accessible format that accurately represents how the MIT Community uses Athena.

Deliverables

  • A repository of data pertaining to Athena, containing not only usage data, but data about the types of applications being run on Athena.
  • A website that presents the above data to the MIT Community, in an easily readable yet accurate format, such that one can instantly get a view of Athena usage on a monthly basis.

Data

We will need to collect the following data:

Machine Count

We currently receive a monthly count of the number of active machines, broken down by platform type. This is posted monthly in sysd_stats.

Login Count

We currently receive a count of the number of logins, broken down by quickstations, cluster machines, other machines, and dialups. Unique login counts are also available. This is posted weekly in sysd_stats.
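
To illustrate how the total and unique login counts differ once they are broken down by category, here is a minimal sketch. It assumes a hypothetical input of (username, category) pairs, one per login; the actual format of the data behind sysd_stats is not described here, and the category labels are placeholders for the four groups named above.

```python
from collections import Counter, defaultdict

# Placeholder labels for the four categories named in the text.
CATEGORIES = ("quickstation", "cluster", "other", "dialup")

def login_counts(logins):
    """Tally total and unique logins per category.

    logins: iterable of (username, category) pairs, one per login
    (a hypothetical input format, not the real sysd_stats data).
    """
    totals = Counter()
    users = defaultdict(set)
    for username, category in logins:
        totals[category] += 1           # every login counts toward the total
        users[category].add(username)   # each user counts once toward unique
    return {c: {"logins": totals[c], "unique": len(users[c])} for c in CATEGORIES}
```

A user who logs in to cluster machines twice in one week adds two to the cluster login count but only one to the cluster unique count.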

Application Usage

For slw-wrapped applications, we have access to logs which tell us when an application was used, for how long, and the hostname of the machine on which it was run. The hostname of the machine can be used to determine whether or not it's a cluster machine. We know that certain wrapped applications provide incorrect duration information, but we can take that into account when compiling the data. Even without duration information, unique launches are still relevant.
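
As a rough illustration of how such logs might be summarized, the sketch below counts launches per application, flags launches from cluster machines, and totals duration only where it can be trusted. The record layout (timestamp, application, duration in seconds, hostname), the hostnames, and the application names are all hypothetical; the real slw log format and the list of applications that report incorrect durations are not specified here.

```python
from collections import defaultdict

# Hypothetical: wrapped applications known to report incorrect durations.
UNRELIABLE_DURATION = {"app_b"}
# Hypothetical: hostnames known to belong to cluster machines.
CLUSTER_HOSTS = {"cluster-1.example.edu", "cluster-2.example.edu"}

def summarize(records):
    """Summarize launches per application.

    records: iterable of (timestamp, app, duration_seconds, hostname)
    tuples (an assumed layout, not the actual log format).
    """
    summary = defaultdict(
        lambda: {"launches": 0, "cluster_launches": 0, "seconds": 0}
    )
    for _timestamp, app, duration, host in records:
        entry = summary[app]
        entry["launches"] += 1
        if host in CLUSTER_HOSTS:
            entry["cluster_launches"] += 1
        if app in UNRELIABLE_DURATION:
            # Duration cannot be trusted; report launches only.
            entry["seconds"] = None
        else:
            entry["seconds"] += duration
    return dict(summary)
```

Applications with unreliable durations still contribute launch counts, which matches the point that launches remain relevant even without duration information.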

For locally installed applications, we have no technical solution in place. Work is ongoing (http://debathena.mit.edu/trac/ticket/340) to gather information on the applications used during the login session, but there are privacy concerns to be addressed.

Warning: usage statistics will be skewed once fall term starts and Debathena is used in earnest. For example, users who launch OpenOffice from the panel or by typing ooffice will get the locally installed version, while users who explicitly use athrun or add -f will get the wrapped version in AFS.