2010-06-28:

Reporting on software downloads

There are two kinds of software downloads currently tracked in very different ways.

1. "Simple Downloads"

Running this url -- http://wserv.mit.edu/sc/download -- forces an output file with the current counter value for each file.  This is the counter value at the moment of the report.  So you have to be careful to run it at specific times if you want a certain date range to be represented.  The counter is an accumulator so you have to calculate the difference between now and the last time you ran the report to see how many downloads there were.  The worksheet attached to this page automates much of this.

  • run the url at monthly intervals to get a new list of files and counter values.
  • paste the contents of that output file at the bottom of the New Data worksheet, add the date to the third column, imitating what's already there; copying the formulas in columns four and five down for all the rows.  New Data has some pivot tables in it, at the top of worksheet, so you can easily see what's going on.
  • If there are filenames that resolve to "0" in the name lookup, copy the filename to the Lookup worksheet and assign a name; recalc and refresh pivot tables.  (If you want to exclude it from counting (like a text file or misspelled url) skip it.)
  • The Pivots worksheet shows the values per software title over time, and computes the difference from the first to the last in the table.  (You can adjust the number of months shown in the pivot table to compute across a subset of months.  For instance, to do a quarter only, turn off the wrong months, leaving just three visible; the Diff column will compute based on what it can see.)

2. Complex Downloads

Downloads.mit.edu is accessible only through VPN using SecureCRT/FX.

The downloads database is only accessible through mysql while logged on to that server. 

It is possible to formulate relatively simple sql scripts that extract useful fields from the download_log table and excerpt them into a file.  That file can then be ftp'd with SecureFX (launched from SecureCRT) out of downloads.mit.edu over to your own PC.

Once you're in, connect to the database with this:

mysql -udownloads -pdownloads download

The following query gets all the columns out of the database for all transactions between two dates, and streams it into a file for transferring elsewhere:

SYSTEM rm temp/downloads/2010-06.txt
TEE temp/downloads-2010-06.txt
SELECT download_uri, download_date, user_id, user_class, user_affiliation, ip_address, user_agent
FROM download_log
WHERE download_date BETWEEN
'2010-06-01' AND '2010-06-30'
;
NOTEE

This one will summarize the download counts per filename:with a within-the-month date stamp

TEE temp/downloads-summary-2010-06.txt
;
select download_uri, count(*), download_date from download_log
WHERE download_date BETWEEN
'2010-06-01' AND '2010-06-30'
group by download_uri
;
notee ;

It's probably important to note that closing a TEE file and reopening it APPENDs the new content.

  • No labels