{viewpdf:name=Graphite_Day-Intro_and_Arch.pdf}
The main page is [http://wiki.github.com/mit-carbon/Graphite/|http://wiki.github.com/mit-carbon/Graphite/gettingstarted]

The mailing list is: graphite-sim@googlegroups.com

Here are my steps to generate a fully functional Ubuntu/Debian VM with Graphite.

# Install a new, clean VM from the Debian network installer ([http://www.debian.org/CD/netinst/])
# Follow the installation steps from the Graphite site:
## Copy the Pin package (Intel's tool for the dynamic instrumentation of programs) to the VM. Use the version specified by the Graphite team, even if it is not the latest one. Untar it and place it somewhere it can stay permanently. I put it at \~/pin/ (/home/balewski/pin/, as seen in the log further down).
## Install the packages needed for compiling (g++, make, etc.) using apt-get
## Install git-core. Although the instructions say it is not essential, I could not compile the code on a 32-bit architecture without it.
## Get the Graphite tarball from GitHub, unpack it, and place it in a permanent directory
## Edit graphite/Makefile.config and set:
### PIN_HOME to point to the directory where you unpacked Pin
### the target architecture, if you run on a 32-bit machine
## Type make and wait \~5 minutes for the build to finish
# To test that the code works, execute:
  make cannon_app_test SIZE=4

You should see output like this:
{code}
balewski@debian5:~/graphite-2$ make cannon_app_test SIZE=4
tests/unit/Makefile:17: warning: overriding commands for target `clean'
tests/apps/Makefile:16: warning: ignoring old commands for target `clean'
tests/benchmarks/Makefile:9: warning: overriding commands for target `clean'
tests/unit/Makefile:17: warning: ignoring old commands for target `clean'
Makefile:23: warning: overriding commands for target `clean'
tests/benchmarks/Makefile:9: warning: ignoring old commands for target `clean'
make -C /home/balewski/graphite-2/tests/apps//cannon
make[1]: Entering directory `/home/balewski/graphite-2/tests/apps/cannon'
/home/balewski/graphite-2/tests/apps/cannon/../../../common/Makefile.common:82: cannon.d: No such file or directory
cc -MM -MG  -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/user -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/misc -c -Wall -O2 -m32 -DTARGET_IA32 -std=c99 cannon.c | sed -e 's,^\([^:]*\)\.o[ ]*:,./\1.o ./\1.d:,' >cannon.d
make[1]: Leaving directory `/home/balewski/graphite-2/tests/apps/cannon'
make[1]: Entering directory `/home/balewski/graphite-2/tests/apps/cannon'
cc  -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/user -I/home/balewski/graphite-2/tests/apps/cannon/../../../common/misc -c -Wall -O2 -m32 -DTARGET_IA32 -std=c99 -c -o cannon.o cannon.c
cannon.c: In function ‘do_cannon’:
cannon.c:79: warning: ‘matSize’ may be used uninitialized in this function
make -C /home/balewski/graphite-2/tests/apps/cannon/../../../common
make[2]: Entering directory `/home/balewski/graphite-2/common'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/balewski/graphite-2/common'
make -C /home/balewski/graphite-2/tests/apps/cannon/../../../pin
make[2]: Entering directory `/home/balewski/graphite-2/pin'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/balewski/graphite-2/pin'
if [ ! -e cannon ] || [ cannon.o -nt cannon ] || [ /home/balewski/graphite-2/tests/apps/cannon/../../../lib/libcarbon_sim.a -nt cannon ] || [ /home/balewski/graphite-2/tests/apps/cannon/../../../lib/pin_sim.so -nt cannon ]; \
   then g++ cannon.o -o cannon -static -u CarbonStartSim -u CarbonStopSim -upthread_create -upthread_join -L/home/balewski/graphite-2/tests/apps/cannon/../../../lib -los-services -L /home/balewski/graphite-2/tests/apps/cannon/../../../os-services-25032-gcc.4.0.0-linux-ia32_intel64/ia32 -m32 -L/home/balewski/graphite-2/tests/apps/cannon/../../../contrib/orion -L/home/balewski/graphite-2/tests/apps/cannon/../../../lib -pthread -lcarbon_sim -los-services -lboost_filesystem-mt -lboost_system-mt -pthread -lorion; \
	fi
/home/balewski/graphite-2/tests/apps/cannon/../../../lib/libcarbon_sim.a(socktransport.o): In function `SockTransport::Socket::connect(char const*, int)':
socktransport.cc:(.text+0x1a0c): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
cd /home/balewski/graphite-2/tests/apps/cannon/../../.. ; /home/balewski/graphite-2/tests/apps/cannon/../../../tools/spawn.py 1 /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg  /home/balewski/pin/ia32/bin/pinbin -mt -t /home/balewski/graphite-2/tests/apps/cannon/../../../lib/pin_sim -c /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg --general/total_cores=5 --general/num_processes=1 --general/enable_shared_mem=true  -- /home/balewski/graphite-2/tests/apps/cannon/cannon -t 4 -m 4
[spawn.py] 'GRAPHITE_ROOT' undefined. Setting 'GRAPHITE_ROOT' to '/home/balewski/graphite-2
[spawn.py] Starting process: 0 : export CARBON_PROCESS_INDEX=0; export LD_LIBRARY_PATH="/afs/csail/group/carbon/tools/boost_1_38_0/stage/lib"; /home/balewski/pin/ia32/bin/pinbin -mt -t /home/balewski/graphite-2/tests/apps/cannon/../../../lib/pin_sim -c /home/balewski/graphite-2/tests/apps/cannon/../../../carbon_sim.cfg --general/total_cores=5 --general/num_processes=1 --general/enable_shared_mem=true -- /home/balewski/graphite-2/tests/apps/cannon/cannon -t 4 -m 4

[[Graphite]] --> [ Core IDs' with memory controllers = (0 1 2 3 4 5 6 ) ]
Starting iteration 0...
Allocating and Initializing matrix a
Allocating and Initializing matrix b
Allocating and Initializing matrix c
Initializing thread structures
Thread 0 starting to retrieve initial data
Thread 0 finished retrieving initial data, starting computation
Thread 1 starting to retrieve initial data
Thread 2 starting to retrieve initial data
Thread 3 starting to retrieve initial data
Thread 1 finished retrieving initial data, starting computation
Thread 2 finished retrieving initial data, starting computation
Thread 3 finished retrieving initial data, starting computation
Done sending, waiting for worker threads to complete
24.000000
Exiting...
[spawn.py] Exited with return code: 0
{code}
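
The output above is long, so a quick scripted check can be handy. Below is a minimal sketch, not part of Graphite itself, that looks for the final `[spawn.py] Exited with return code: 0` line and counts the worker threads that reported finishing their initial data retrieval. It assumes you saved the output to a file yourself, for example with `make cannon_app_test SIZE=4 2>&1 | tee cannon_test.log`; the file name `cannon_test.log` and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: sanity-check a saved Graphite test log.
# The helper names below are made up for this example.

# Succeeds (exit status 0) only if spawn.py reported return code 0.
graphite_run_ok() {
    grep -q '^\[spawn\.py\] Exited with return code: 0[[:space:]]*$' "$1"
}

# Prints the number of "Thread N finished retrieving initial data" lines.
count_started_threads() {
    grep -c '^Thread [0-9][0-9]* finished retrieving initial data' "$1"
}
```

After sourcing these functions, `graphite_run_ok cannon_test.log && echo OK` prints OK only for a clean run, and `count_started_threads cannon_test.log` should match the thread count passed to the test (4 in the run above).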

The log file should also show some performance statistics:
{code}
cat output_files/sim.out
Simulation timers:
start time	3623124
stop time	5712947
shutdown time	6039413
                                      | Core 0          | Core 1          | Core 2          | Core 3          | Core 4          | TS 0            | MCP             |
Core Performance Model Summary        |                 |                 |                 |                 |                 |                 |                 |
    Instructions                      |  15981          |  6188           |  6186           |  6156           |  6175           |  2754           |  0              |
    Completion Time                   |  123142         |  105450         |  104754         |  103939         |  101949         |  14449          |  0              |
    Average Frequency                 |  1              |  1              |  1              |  1              |  1              |  0              |  0              |
    Recv Instructions                 |  4              |  0              |  0              |  0              |  0              |  0              |  0              |
    Recv Instruction Costs            |  14336          |  0              |  0              |  0              |  0              |  0              |  0              |
    Sync Instructions                 |  3              |  9              |  8              |  7              |  8              |  0              |  0              |
    Sync Instruction Costs            |  54655          |  45850          |  42055          |  42596          |  42078          |  0              |  0              |
  Branch predictor stats              |                 |                 |                 |                 |                 |                 |                 |
    num correct                       |  1886           |  694            |  695            |  691            |  698            |  267            |  0              |
    num incorrect                     |  384            |  195            |  195            |  196            |  191            |  39             |  0              |
    type                              |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |  one-bit (1024) |
Network summary                       |                 |                 |                 |                 |                 |                 |                 |
  Network model 0                     |                 |                 |                 |                 |                 |                 |                 |
    num packets received              |  8              |  17             |  17             |  17             |  17             |  0              |  131            |
    num bytes received                |  384            |  764            |  764            |  764            |  764            |  0              |  9856           |
    average latency (in clock cycles) |  9              |  8.47059        |  8.47059        |  9.52941        |  9.52941        |  nan            |  14.8244        |
    average latency (in ns)           |  9              |  8.47059        |  8.47059        |  9.52941        |  9.52941        |  nan            |  14.8244        |
    Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
    Dynamic Energy                    |  1.89929e-06    |  1.86787e-06    |  9.37239e-07    |  1.2594e-06     |  5.22373e-07    |  0              |  0              |
  Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
    Switch Allocator Traversals       |  153            |  122            |  64             |  85             |  39             |  0              |  0              |
    Crossbar Traversals               |  1167           |  1148           |  576            |  774            |  321            |  0              |  0              |
    Link Traversals                   |  1167           |  1148           |  576            |  774            |  321            |  0              |  0              |
  Network model 1                     |                 |                 |                 |                 |                 |                 |                 |
    num packets received              |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    num bytes received                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    average latency (in clock cycles) |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
    average latency (in ns)           |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
    Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
    Dynamic Energy                    |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
  Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
    Switch Allocator Traversals       |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Crossbar Traversals               |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Link Traversals                   |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
  Network model 2                     |                 |                 |                 |                 |                 |                 |                 |
    num packets received              |  345            |  368            |  827            |  384            |  332            |  188            |  138            |
    num bytes received                |  19257          |  20720          |  41499          |  21696          |  18892          |  8188           |  5962           |
    average latency (in clock cycles) |  7.47246        |  8.29076        |  6.14994        |  7.75           |  7.9006         |  8.58511        |  10.9275        |
    average latency (in ns)           |  7.47246        |  8.29076        |  6.14994        |  7.75           |  7.9006         |  8.58511        |  10.9275        |
    Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
    Dynamic Energy                    |  6.8307e-06     |  5.40692e-06    |  8.50509e-06    |  4.03165e-06    |  3.80035e-06    |  4.47535e-06    |  4.02241e-06    |
  Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
    Switch Allocator Traversals       |  462            |  458            |  674            |  345            |  315            |  251            |  280            |
    Crossbar Traversals               |  4198           |  3322           |  5226           |  2477           |  2335           |  2751           |  2472           |
    Link Traversals                   |  4198           |  3322           |  5226           |  2477           |  2335           |  2751           |  2472           |
  Network model 3                     |                 |                 |                 |                 |                 |                 |                 |
    num packets received              |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    num bytes received                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    average latency (in clock cycles) |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
    average latency (in ns)           |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
    Static Power                      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |  0.0176561      |
    Dynamic Energy                    |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
  Activity Counters                   |                 |                 |                 |                 |                 |                 |                 |
    Switch Allocator Traversals       |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Crossbar Traversals               |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Link Traversals                   |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
  Network model 4                     |                 |                 |                 |                 |                 |                 |                 |
    num packets received              |  1              |  1              |  1              |  1              |  1              |  0              |  0              |
    num bytes received                |  40             |  40             |  40             |  40             |  40             |  0              |  0              |
Shmem Perf Model summary              |                 |                 |                 |                 |                 |                 |                 |
    num memory accesses               |  8036           |  2950           |  2950           |  2937           |  2945           |  1487           |  0              |
    average memory access latency     |  4.75075        |  8.31458        |  9.35627        |  8.93667        |  8.39728        |  7.86685        |  nan            |
Cache Summary                         |                 |                 |                 |                 |                 |                 |                 |
  Cache L1-I                          |                 |                 |                 |                 |                 |                 |                 |
    num cache accesses                |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    miss rate                         |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |  nan            |
    num cache misses                  |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
  Cache L1-D                          |                 |                 |                 |                 |                 |                 |                 |
    num cache accesses                |  8066           |  2954           |  2954           |  2941           |  2949           |  1568           |  0              |
    miss rate                         |  1.46293        |  3.65606        |  4.40081        |  3.94424        |  3.62835        |  8.09949        |  nan            |
    num cache misses                  |  118            |  108            |  130            |  116            |  107            |  127            |  0              |
  Cache L2                            |                 |                 |                 |                 |                 |                 |                 |
    num cache accesses                |  118            |  108            |  130            |  116            |  107            |  127            |  0              |
    miss rate                         |  74.5763        |  100            |  100            |  100            |  100            |  100            |  nan            |
    num cache misses                  |  88             |  108            |  130            |  116            |  107            |  127            |  0              |
Dram Perf Model summary               |                 |                 |                 |                 |                 |                 |                 |
    num dram accesses                 |  80             |  68             |  183            |  57             |  46             |  87             |  58             |
    average dram access latency       |  113            |  113.074        |  113.076        |  113            |  113            |  113            |  113.276        |
    average dram queueing delay       |  0              |  0.0735294      |  0.0765027      |  0              |  0              |  0              |  0.275862       |
  Queue Model                         |                 |                 |                 |                 |                 |                 |                 |
    Queue Utilization(%)              |  0.920492       |  0.728351       |  2.14126        |  0.654045       |  0.528698       |  0.996801       |  0.665631       |
    Analytical Model Used(%)          |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
Dram Directory Cache                  |                 |                 |                 |                 |                 |                 |                 |
    Total Addresses                   |  1170           |  1169           |  1166           |  1169           |  1170           |  1174           |  1173           |
    Average set size                  |  1.14258        |  1.1416         |  1.13867        |  1.1416         |  1.14258        |  1.14648        |  1.14551        |
    Set index with max size           |  1022           |  1023           |  1017           |  1022           |  995            |  994            |  1014           |
    Max set size                      |  6              |  6              |  5              |  5              |  5              |  5              |  5              |
    Set index with min size           |  543            |  543            |  543            |  544            |  543            |  543            |  543            |
    Min set size                      |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Average evictions per set         |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Set index with max evictions      |  -1             |  -1             |  -1             |  -1             |  -1             |  -1             |  -1             |
    Max set evictions                 |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
    Address with max evictions        |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |  0xffffffff     |
    Max address evictions             |  0              |  0              |  0              |  0              |  0              |  0              |  0              |
balewski@debian5:~/graphite-2$


{code}
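
The table in sim.out is wide, so it can help to pull out a single row. Here is a minimal sketch, assuming the pipe-separated layout shown above; the function name `sim_row` is hypothetical and is not something Graphite provides. It prints the values of the first row whose label matches exactly. Labels such as "num packets received" repeat under every network model, so only the first occurrence is returned, and cells that contain spaces (like the branch predictor type) will not split cleanly.

```shell
#!/bin/sh
# Sketch: print the per-column values of one row from sim.out.
# Usage: sim_row FILE LABEL
# LABEL is compared against the text before the first '|', with
# surrounding whitespace removed. Only the first matching row is used.
sim_row() {
    awk -F'|' -v label="$2" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)       # trim the row label
            if (name != label) next
            out = ""
            for (i = 2; i <= NF; i++) {
                val = $i
                gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim each cell
                if (val != "") out = out (out == "" ? "" : " ") val
            }
            print out
            exit
        }
    ' "$1"
}
```

For the file shown above, `sim_row output_files/sim.out "Instructions"` prints `15981 6188 6186 6156 6175 2754 0`, one value per column from Core 0 through MCP.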