Problem Statement

There is no single repository for data related to the Athena Computing Environment. We have been collecting usage data for quite some time, but it has not found its way to the appropriate audience. There is also a desire to know not just how many users are using Athena, but what they are using it for. While we can produce numerous anecdotes, the plural of anecdote is not data. Any data collected needs to be presented in an easily accessible format that accurately represents how the MIT Community uses Athena.

Deliverables

  • A repository of data pertaining to Athena, containing not only usage data, but data about the types of applications being run on Athena.
  • A website that presents the above data to the MIT Community, in an easily readable yet accurate format, such that one can instantly get a view of Athena usage on a monthly basis.

Data

We will need to collect the following data:

Machine Count

We currently receive a monthly count of the number of active machines, broken down by platform type. This is posted monthly in sysd_stats.

Login Count

We currently receive a count of the number of logins, broken down by quickstations, cluster machines, other machines, and dialups. Unique login counts are also available. This is posted weekly in sysd_stats.

We also have data from a script that polls machines via busyd every 5 minutes and thus captures login session duration (accurate to +/- 5 minutes). That data lives on skywest in a SQL database.
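As a rough illustration of how session durations can be derived from periodic polling, the sketch below groups poll observations into sessions. It is not the existing script: the input format (timestamp, host, user tuples) and the function name estimate_sessions are assumptions made for this example, and the real busyd data on skywest may be shaped differently.

```python
from datetime import datetime, timedelta

# Assumed polling period, taken from the 5-minute interval described above.
POLL_INTERVAL = timedelta(minutes=5)

def estimate_sessions(observations, poll_interval=POLL_INTERVAL):
    """Group poll observations into login sessions.

    observations: iterable of (timestamp, host, user) tuples, one for each
    poll in which `user` was seen logged in on `host` (hypothetical format).
    Returns a list of (host, user, first_seen, last_seen, observed_span).
    """
    by_key = {}
    for ts, host, user in observations:
        by_key.setdefault((host, user), []).append(ts)

    sessions = []
    for (host, user), stamps in sorted(by_key.items()):
        stamps.sort()
        start = prev = stamps[0]
        for ts in stamps[1:]:
            if ts - prev > poll_interval:
                # At least one poll did not see the user: close the session.
                sessions.append((host, user, start, prev, prev - start))
                start = ts
            prev = ts
        sessions.append((host, user, start, prev, prev - start))
    return sessions
```

The observed span runs from the first to the last poll that saw the user, so the real session extends by less than one poll interval on each side. In practice the gap test would probably need some slack for polls that run slightly late.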

Application Usage

For slw-wrapped applications, we have access to logs which tell us when an application was used, for how long, and the hostname of the machine on which it was run. The hostname of the machine can be used to determine whether or not it's a cluster machine. We know that certain wrapped applications provide incorrect duration information, but we can take that into account when compiling the data. Even without duration information, unique launches are still relevant.
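The compilation step described above might look something like the following sketch. The record layout (application, hostname, duration in seconds), the set of cluster hostnames, and the set of applications with unreliable durations are all hypothetical inputs invented for illustration; the actual slw log format is not specified here.

```python
from collections import defaultdict

def summarize_launches(records, cluster_hosts, bad_duration_apps=frozenset()):
    """Count launches per application, split by cluster vs. other machines.

    records: iterable of (application, hostname, seconds) tuples, where
    seconds may be None (hypothetical format, not the real slw log layout).
    cluster_hosts: set of hostnames known to be cluster machines.
    bad_duration_apps: applications whose reported durations are not trusted.
    """
    summary = defaultdict(lambda: {"cluster": 0, "other": 0, "seconds": 0})
    for app, host, seconds in records:
        entry = summary[app]
        # Launches always count, even when the duration is unusable.
        entry["cluster" if host in cluster_hosts else "other"] += 1
        if app not in bad_duration_apps and seconds is not None:
            entry["seconds"] += seconds
    for app in summary:
        if app in bad_duration_apps:
            # Mark the total as unknown rather than reporting a bogus number.
            summary[app]["seconds"] = None
    return dict(summary)
```

Reporting an unknown duration as None, rather than zero, keeps launch counts usable for applications whose wrappers misreport how long they ran.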

For locally installed applications, we have no technical solution in place. Work is ongoing (http://debathena.mit.edu/trac/ticket/340) to gather information on the applications used during the login session, but there are privacy concerns to be addressed.

Warning: usage statistics will be skewed once fall term starts and Debathena is used in earnest. For example, users launching OpenOffice from the panel or by typing ooffice will get the locally installed version, and users explicitly using athrun or add -f will get the wrapped version in AFS.

Presentation

While data at the weekly level may be relevant, it fluctuates enough that data should likely be presented at the monthly level, possibly with quarterly summaries. We should keep in mind that usage declines somewhat during IAP, and declines significantly in the summer.
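A minimal sketch of rolling dated counts up to monthly and quarterly totals follows. It assumes the input is a list of (date, count) pairs, assigns each count to the month containing its date, and uses calendar quarters; all three are choices made for this example, and summaries aligned to academic terms would need different boundaries.

```python
from collections import Counter
from datetime import date

def rollup(counts):
    """Sum dated counts into monthly and calendar-quarter totals.

    counts: iterable of (date, n) pairs, e.g. one pair per weekly report
    (hypothetical format). A weekly figure is credited entirely to the
    month of its date, even if the week spans a month boundary.
    """
    monthly = Counter()
    quarterly = Counter()
    for day, n in counts:
        monthly[(day.year, day.month)] += n
        quarterly[(day.year, (day.month - 1) // 3 + 1)] += n
    return monthly, quarterly
```

This kind of summing only suits additive figures such as total logins. Unique login counts cannot be added across weeks, since a user who logs in during two different weeks would be counted twice; they would have to be recomputed from the underlying data for each month or quarter.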
