The mailing list is:


  1. Install a new, clean VM from ( Removed)/)
  2. Follow the installation steps from the Graphite site:
    1. Download Pin (a tool from Intel for the dynamic instrumentation of programs). Use the version specified by the Graphite team, even if it is not the latest one. Untar it and place it somewhere it can stay permanently. I put it at ~/pin/ (that is, /home/balewski/pin/, as seen in the output below).
    2. Install the tools and libraries needed for compiling (g++, make, etc.) using apt-get.
    3. Install git-core. Although the instructions say it is not essential, I could not compile the code on a 32-bit architecture without it.
    4. Download the Graphite tarball from GitHub, unpack it, and place it in a permanent directory.
    5. Edit graphite/Makefile.config and:
      1. set PIN_HOME to point to your Pin directory
      2. change the target architecture if you run on a 32-bit machine
    6. Type make and wait ~5 minutes.
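Steps 2.2 through 2.6 can be sketched as a short shell session. This is only a sketch under assumptions, not the Graphite team's procedure: the helper name `set_pin_home` is made up for illustration, and it assumes Makefile.config contains a line of the form `PIN_HOME = ...`, which you should verify in your own copy.

```shell
# Hypothetical helper: point PIN_HOME in a Makefile.config at a Pin directory.
# ASSUMPTION: the file has a line starting with "PIN_HOME" followed by "=".
set_pin_home() {
    config_file=$1
    pin_dir=$2
    tmp_file=$(mktemp) || return 1
    # Rewrite only the PIN_HOME assignment; every other line passes through.
    awk -v dir="$pin_dir" '
        /^PIN_HOME[ \t]*=/ { print "PIN_HOME = " dir; next }
        { print }
    ' "$config_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Copy the contents back so the original file keeps its permissions.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}

# Example session (commented out so nothing is installed by accident):
#   apt-get install g++ make git-core        # run as root
#   set_pin_home graphite/Makefile.config "$HOME/pin"
#   cd graphite && make
```

Editing the file by hand works just as well; the helper only matters if you rebuild the VM often and want the step to be repeatable.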
  3. To test that the code works, execute:
      make cannon_app_test SIZE=4
    You should see output like this:
    balewski@debian5:~/graphite-2$ make cannon_app_test SIZE=4
    tests/unit/Makefile:17: warning: overriding commands for target `clean'
    tests/apps/Makefile:16: warning: ignoring old commands for target `clean'
    tests/benchmarks/Makefile:9: warning: overriding commands for target `clean'
    tests/unit/Makefile:17: warning: ignoring old commands for target `clean'
    Makefile:23: warning: overriding commands for target `clean'
    tests/benchmarks/Makefile:9: warning: ignoring old commands for target `clean'
    make -C /home/balewski/graphite-2/tests/apps//cannon
    make[1]: Entering directory `/home/balewski/graphite-2/tests/apps/cannon'
    /home/balewski/graphite-2/tests/apps/cannon/../../../common/Makefile.common:82: cannon.d: No such file or directory
    cc -MM -MG  -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/user -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/misc -c -Wall -O2 -m32 -DTARGET_IA32 -std=c99 cannon.c | sed -e 's,^\([^:]*\)\.o[ ]*:,./\1.o ./\1.d:,' >cannon.d
    make[1]: Leaving directory `/home/balewski/graphite-2/tests/apps/cannon'
    make[1]: Entering directory `/home/balewski/graphite-2/tests/apps/cannon'
    cc  -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/user -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/misc -c -Wall -O2 -m32 -DTARGET_IA32 -std=c99 -c -o cannon.o cannon.c
    cannon.c: In function ‘do_cannon’:
    cannon.c:79: warning: ‘matSize’ may be used uninitialized in this function
    make -C /home/balewski/graphite-2/tests/apps/cannon/../../../common
    make[2]: Entering directory `/home/balewski/graphite-2/common'
    make[2]: Nothing to be done for `all'.
    make[2]: Leaving directory `/home/balewski/graphite-2/common'
    make -C /home/balewski/graphite-2/tests/apps/cannon/../../../pin
    make[2]: Entering directory `/home/balewski/graphite-2/pin'
    make[2]: Nothing to be done for `all'.
    make[2]: Leaving directory `/home/balewski/graphite-2/pin'
    if [ ! -e cannon ] || [ cannon.o -nt cannon ] || [ /home/balewski/graphite-2/tests/apps/cannon/../../../lib/libcarbon_sim.a -nt cannon ] || [ /home/balewski/graphite-2/tests/apps/cannon/../../../lib/ -nt cannon ]; \
       then g++ cannon.o -o cannon -static -u CarbonStartSim -u CarbonStopSim -upthread_create -upthread_join -L/home/balewski/graphite-2/tests/apps/cannon/../../../lib -los-services -L /home/balewski/graphite-2/tests/apps/cannon/../../../os-services-25032-gcc.4.0.0-linux-ia32_intel64/ia32 -m32 -L/home/balewski/graphite-2/tests/apps/cannon/../../../contrib/orion -L/home/balewski/graphite-2/tests/apps/cannon/../../../lib -pthread -lcarbon_sim -los-services -lboost_filesystem-mt -lboost_system-mt -pthread -lorion; \
    /home/balewski/graphite-2/tests/apps/cannon/../../../lib/libcarbon_sim.a(socktransport.o): In function `SockTransport::Socket::connect(char const*, int)': warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
    cd /home/balewski/graphite-2/tests/apps/cannon/../../.. ; /home/balewski/graphite-2/tests/apps/cannon/../../../tools/ 1 /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg  /home/balewski/pin/ia32/bin/pinbin -mt -t /home/balewski/graphite-2/tests/apps/cannon/../../../lib/pin_sim -c /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg --general/total_cores=5 --general/num_processes=1 --general/enable_shared_mem=true  -- /home/balewski/graphite-2/tests/apps/cannon/cannon -t 4 -m 4
    [] 'GRAPHITE_ROOT' undefined. Setting 'GRAPHITE_ROOT' to '/home/balewski/graphite-2
    [] Starting process: 0 : export CARBON_PROCESS_INDEX=0; export LD_LIBRARY_PATH="/afs/csail/group/carbon/tools/boost_1_38_0/stage/lib"; /home/balewski/pin/ia32/bin/pinbin -mt -t /home/balewski/graphite-2/tests/apps/cannon/../../../lib/pin_sim -c /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg --general/total_cores=5 --general/num_processes=1 --general/enable_shared_mem=true -- /home/balewski/graphite-2/tests/apps/cannon/cannon -t 4 -m 4
    [[Graphite]] --> [ Core IDs' with memory controllers = (0 1 2 3 4 5 6 ) ]
    Starting iteration 0...
    Allocating and Initializing matrix a
    Allocating and Initializing matrix b
    Allocating and Initializing matrix c
    Initializing thread structures
    Thread 0 starting to retrieve initial data
    Thread 0 finished retrieving initial data, starting computation
    Thread 1 starting to retrieve initial data
    Thread 2 starting to retrieve initial data
    Thread 3 starting to retrieve initial data
    Thread 1 finished retrieving initial data, starting computation
    Thread 2 finished retrieving initial data, starting computation
    Thread 3 finished retrieving initial data, starting computation
    Done sending, waiting for worker threads to complete
    [] Exited with return code: 0
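The last line of the output above is the quickest sign that the run succeeded. As a minimal sketch, assuming you captured the output in a file (the name `run.log` and the helper `graphite_run_ok` are hypothetical), the check can be automated:

```shell
# Hypothetical helper: succeed only if a captured run log reports a clean exit.
# The marker text is copied from the sample output; trailing whitespace
# (for example a carriage return) after the 0 is tolerated.
graphite_run_ok() {
    grep -Eq 'Exited with return code: 0[[:space:]]*$' "$1"
}

# Example (commented out; assumes the output is saved to run.log):
#   make cannon_app_test SIZE=4 2>&1 | tee run.log
#   graphite_run_ok run.log && echo "simulation exited cleanly"
```

Anchoring the pattern at the end of the line keeps a non-zero code such as 01 or 10 from being mistaken for success.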
    The log file should also show some performance statistics:
    cat  output_files/sim.out
    Simulation timers:
    start time	3623124
    stop time	5712947
    shutdown time	6039413
                                          | Core 0          | Core 1          | Core 2          | Core 3          | Core 4          | TS 0            | MCP             |
    Core Performance Model Summary        |                 |                 |                 |                 |                 |                 |                 |
        Instructions                      |  15981          |  6188           |  6186           |  6156           |  6175           |  2754           |  0              |
        Completion Time                   |  123142         |  105450         |  104754         |  103939         |  101949         |  14449          |  0              |
        Average Frequency                 |  1              |  1              |  1              |  1              |  1              |  0              |  0              |
        Recv Instructions                 |  4              |  0              |  0              |  0              |  0              |  0              |  0              |
        Recv Instruction Costs            |  14336          |  0              |  0              |  0              |  0              |  0              |  0              |
        Sync Instructions                 |  3              |  9              |  8              |  7              |  8              |  0              |  0              |
        Sync Instruction Costs            |  54655          |  45850          |  42055          |  42596          |  42078          |  0              |  0              |
      Branch predictor stats              |                 |                 |                 |                 |                 |                 |                 |
        num correct                       |  1886           |  694            |  695            |  691            |  698            |  267            |  0              |
        num incorrect                     |  384            |  195            |  195            |  196            |  191            |  39             |  0              |
        type                              |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |
    Network summary                       |                 |                 |                 |                 |                 |                 |                 |
      Network model 0                     |                 |                 |                 |                 |                 |                 |                 |
        num packets received              |  8              |  17             |  17             |  17             |  17             |  0              |  131            |
        num bytes received                |  384            |  764            |  764            |  764            |  764            |  0              |  9856           |
        average latency (in clock cycles) |  9              |  8.47059        |  8.47059        |  9.52941        |  9.52941        |  nan            |  14.8244        |
        average latency (in ns)           |  9              |  8.47059        |  8.47059        |  9.52941        |  9.52941        |  nan            |  14.8244        |
        Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
        Dynamic Energy                    |  1.89929e-06    |  1.86787e-06    |  9.37239e-07    |  1.2594e-06     |  5.22373e-07    |  0              |  0              |
      Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
        Switch Allocator Traversals       |  153            |  122            |  64             |  85             |  39             |  0              |  0              |
        Crossbar Traversals               |  1167           |  1148           |  576            |  774            |  321            |  0              |  0              |
        Link Traversals                   |  1167           |  1148           |  576            |  774            |  321            |  0              |  0              |
      Network model 1                     |                 |                 |                 |                 |                 |                 |                 |
        num packets received              |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        num bytes received                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        average latency (in clock cycles) |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
        average latency (in ns)           |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
        Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
        Dynamic Energy                    |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
      Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
        Switch Allocator Traversals       |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Crossbar Traversals               |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Link Traversals                   |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
      Network model 2                     |                 |                 |                 |                 |                 |                 |                 |
        num packets received              |  345            |  368            |  827            |  384            |  332            |  188            |  138            |
        num bytes received                |  19257          |  20720          |  41499          |  21696          |  18892          |  8188           |  5962           |
        average latency (in clock cycles) |  7.47246        |  8.29076        |  6.14994        |  7.75           |  7.9006         |  8.58511        |  10.9275        |
        average latency (in ns)           |  7.47246        |  8.29076        |  6.14994        |  7.75           |  7.9006         |  8.58511        |  10.9275        |
        Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
        Dynamic Energy                    |  6.8307e-06     |  5.40692e-06    |  8.50509e-06    |  4.03165e-06    |  3.80035e-06    |  4.47535e-06    |  4.02241e-06    |
      Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
        Switch Allocator Traversals       |  462            |  458            |  674            |  345            |  315            |  251            |  280            |
        Crossbar Traversals               |  4198           |  3322           |  5226           |  2477           |  2335           |  2751           |  2472           |
        Link Traversals                   |  4198           |  3322           |  5226           |  2477           |  2335           |  2751           |  2472           |
      Network model 3                     |                 |                 |                 |                 |                 |                 |                 |
        num packets received              |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        num bytes received                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        average latency (in clock cycles) |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
        average latency (in ns)           |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
        Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
        Dynamic Energy                    |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
      Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
        Switch Allocator Traversals       |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Crossbar Traversals               |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Link Traversals                   |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
      Network model 4                     |                 |                 |                 |                 |                 |                 |                 |
        num packets received              |  1              |  1              |  1              |  1              |  1              |  0              |  0              |
        num bytes received                |  40             |  40             |  40             |  40             |  40             |  0              |  0              |
    Shmem Perf Model summary              |                 |                 |                 |                 |                 |                 |                 |
        num memory accesses               |  8036           |  2950           |  2950           |  2937           |  2945           |  1487           |  0              |
        average memory access latency     |  4.75075        |  8.31458        |  9.35627        |  8.93667        |  8.39728        |  7.86685        |  nan            |
    Cache Summary                         |                 |                 |                 |                 |                 |                 |                 |
      Cache L1-I                          |                 |                 |                 |                 |                 |                 |                 |
        num cache accesses                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        miss rate                         |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
        num cache misses                  |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
      Cache L1-D                          |                 |                 |                 |                 |                 |                 |                 |
        num cache accesses                |  8066           |  2954           |  2954           |  2941           |  2949           |  1568           |  0              |
        miss rate                         |  1.46293        |  3.65606        |  4.40081        |  3.94424        |  3.62835        |  8.09949        |  nan            |
        num cache misses                  |  118            |  108            |  130            |  116            |  107            |  127            |  0              |
      Cache L2                            |                 |                 |                 |                 |                 |                 |                 |
        num cache accesses                |  118            |  108            |  130            |  116            |  107            |  127            |  0              |
        miss rate                         |  74.5763        |  100            |  100            |  100            |  100            |  100            |  nan            |
        num cache misses                  |  88             |  108            |  130            |  116            |  107            |  127            |  0              |
    Dram Perf Model summary               |                 |                 |                 |                 |                 |                 |                 |
        num dram accesses                 |  80             |  68             |  183            |  57             |  46             |  87             |  58             |
        average dram access latency       |  113            |  113.074        |  113.076        |  113            |  113            |  113            |  113.276        |
        average dram queueing delay       |  0              |  0.0735294      |  0.0765027      |  0              |  0              |  0              |  0.275862       |
      Queue Model                         |                 |                 |                 |                 |                 |                 |                 |
        Queue Utilization(%)              |  0.920492       |  0.728351       |  2.14126        |  0.654045       |  0.528698       |  0.996801       |  0.665631       |
        Analytical Model Used(%)          |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Dram Directory Cache                  |                 |                 |                 |                 |                 |                 |                 |
        Total Addresses                   |  1170           |  1169           |  1166           |  1169           |  1170           |  1174           |  1173           |
        Average set size                  |  1.14258        |  1.1416         |  1.13867        |  1.1416         |  1.14258        |  1.14648        |  1.14551        |
        Set index with max size           |  1022           |  1023           |  1017           |  1022           |  995            |  994            |  1014           |
        Max set size                      |  6              |  6              |  5              |  5              |  5              |  5              |  5              |
        Set index with min size           |  543            |  543            |  543            |  544            |  543            |  543            |  543            |
        Min set size                      |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Average evictions per set         |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Set index with max evictions      |  -1             |  -1             |  -1             |  -1             |  -1             |  -1             |  -1             |
        Max set evictions                 |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
        Address with max evictions        |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |
        Max address evictions             |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
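The table above is pipe-delimited, so individual rows are easy to pull out with awk. The sketch below is an illustration only: `sim_row` and `miss_rate_pct` are hypothetical helper names, and the miss-rate formula (100 * misses / accesses) is inferred from the numbers in the table, where 118 misses over 8066 L1-D accesses on Core 0 is reported as 1.46293.

```shell
# Hypothetical helper: print the values of the FIRST row whose label matches
# exactly, one value per line, in column order (Core 0, Core 1, ...).
# Labels that repeat per section (e.g. "num cache accesses") return only
# their first occurrence.
sim_row() {
    awk -F'|' -v want="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim the row label
            if (name != want) next
            for (i = 2; i <= NF; i++) {
                value = $i
                gsub(/^[ \t]+|[ \t]+$/, "", value) # trim each cell
                if (value != "") print value
            }
            exit
        }
    ' "$2"
}

# Miss rate in percent, printed with six significant digits like the table.
miss_rate_pct() {
    awk -v m="$1" -v a="$2" 'BEGIN {
        if (a == 0) print "nan"; else printf "%.6g\n", 100 * m / a
    }'
}

# Example (commented out):
#   sim_row Instructions output_files/sim.out
#   miss_rate_pct 118 8066
```

Matching the label exactly keeps `Instructions` from also selecting `Recv Instructions` or `Sync Instructions`.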