API documentation with javadoc.
Maximum size of the matrices for a given amount of main memory.
MatMult holds 5 matrices, and a double takes 8 bytes.
The maximum n satisfies 5 * 8 * n^2 < MEMsize, that is
n < sqrt( MEMsize / (5*8) )
for MEMsize = 1 GB this gives n < 5100
for MEMsize = 800 MB this gives n < 4500
for MEMsize = 512 MB this gives n < 3500
for MEMsize = 400 MB this gives n < 3200
for MEMsize = 256 MB this gives n < 2500
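The bound above can be evaluated directly. The following is a minimal sketch, not part of the original sources: the class name `MaxMatrixSize` and the method `maxN` are hypothetical, and 1 MB is assumed to mean 2^20 bytes. It returns the largest integer n that still satisfies the inequality, so its results are slightly larger than the rounded-down figures listed above.

```java
// Hypothetical helper for illustration; not part of the original MatMult code.
final class MaxMatrixSize {
    static final int MATRICES = 5;         // MatMult holds 5 matrices
    static final int BYTES_PER_DOUBLE = 8; // size of one matrix entry

    /** Largest n with MATRICES * BYTES_PER_DOUBLE * n * n < memBytes. */
    static long maxN(long memBytes) {
        long n = (long) Math.sqrt((double) memBytes / (MATRICES * BYTES_PER_DOUBLE));
        // Step down if rounding left us on or above the strict bound.
        while (n > 0 && MATRICES * BYTES_PER_DOUBLE * n * n >= memBytes) {
            n--;
        }
        return n;
    }

    public static void main(String[] args) {
        long mb = 1L << 20; // assumption: 1 MB = 2^20 bytes
        long[] sizes = {1024 * mb, 800 * mb, 512 * mb, 400 * mb, 256 * mb};
        for (long mem : sizes) {
            System.out.println("MEMsize = " + (mem / mb) + " MB: largest n = " + maxN(mem));
        }
    }
}
```

In practice the usable n is smaller, because the operating system and the JVM itself also occupy part of the main memory.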
Try to fit the block of matrix B into the L2 cache.
The B block has blocksize*n entries,
plus one row each of A and C, that is 2 * n entries.
Choose blocksize such that (blocksize + 2) * 8 * n < L2size, that is
blocksize < L2size / (8*n) - 2
for L2size = 1024 KB and n = 4000 this gives blocksize < 30
for L2size = 1024 KB and n = 1000 this gives blocksize < 125
for L2size = 512 KB and n = 4000 this gives blocksize < 14
for L2size = 512 KB and n = 2500 this gives blocksize < 24
for L2size = 512 KB and n = 1000 this gives blocksize < 63
for L2size = 512 KB and n = 800 this gives blocksize < 78
for L2size = 512 KB and n = 250 this gives blocksize < 260
for L2size = 256 KB and n = 4000 this gives blocksize < 5
for L2size = 256 KB and n = 1000 this gives blocksize < 30
for L2size = 256 KB and n = 800 this gives blocksize < 38
for L2size = 256 KB and n = 400 this gives blocksize < 78
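The block size bound can be computed the same way. This is a minimal sketch under stated assumptions: the class `BlockSizeBound` and the method `maxBlockSize` are hypothetical names, and 1 KB is taken as 1024 bytes. It returns the largest integer blocksize that satisfies (blocksize + 2) * 8 * n < L2size, so some results may differ by a few units from the rounded figures in the list above.

```java
// Hypothetical helper for illustration; not part of the original MatMult code.
final class BlockSizeBound {
    static final int BYTES_PER_DOUBLE = 8;

    /** Largest blocksize with (blocksize + 2) * 8 * n < l2Bytes, or 0 if none fits. */
    static long maxBlockSize(long l2Bytes, long n) {
        long rowBytes = BYTES_PER_DOUBLE * n;
        // Largest number of full rows whose total size stays strictly below l2Bytes.
        long rows = (l2Bytes - 1) / rowBytes;
        // Two of those rows are reserved for one row of A and one row of C.
        return Math.max(0, rows - 2);
    }

    public static void main(String[] args) {
        long kb = 1024; // assumption: 1 KB = 1024 bytes
        long[][] cases = {{1024, 4000}, {512, 4000}, {512, 1000}, {256, 1000}};
        for (long[] c : cases) {
            System.out.println("L2size = " + c[0] + " KB, n = " + c[1]
                    + ": largest blocksize = " + maxBlockSize(c[0] * kb, c[1]));
        }
    }
}
```

The bound assumes that the L2 cache holds nothing else, so a somewhat smaller block size may perform better on a real machine.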
Results:
Machine (CPUs) | MEMsize | n | L2size | block size | GHz | Time (s) | Algorithm (threads) | JDK |
---|---|---|---|---|---|---|---|---|
BAcluster (2) | 900MB | 4000 | 512KB | 20 | 2.4 | 380.6 | SeqBlockTrans | 1.4.2 |
BAcluster (2) | 900MB | 4000 | 512KB | 10 | 2.4 | 191.3 | ParProcBlockTrans(4) | 1.4.2 |
BAcluster (2) | 900MB | 4000 | 512KB | 15 | 2.4 | 146.9 | ParProcBlockTrans(4) | 1.4.2 |
BAcluster (2) | 900MB | 4000 | 512KB | 15 | 2.4 | 180.2 | ParProcBlockTrans(2) | 1.4.2 |
BAcluster (2) | 900MB | 4000 | 512KB | 15 | 2.4 | 135.9 | ParProcBlockTrans(6) | 1.4.2 |
LaptopA (1) | 800MB | 4000 | 1MB | 20 | 1.4 | 312.0 | ParProcBlockTrans(4) | 1.5 |
LaptopA (1) | 800MB | 4000 | 1MB | 25 | 1.4 | 384.0 | ParProcBlockTrans(4) | 1.5 |
LaptopA (1) | 800MB | 4000 | 1MB | 20 | 1.4 | 500.0 | SeqBlockTrans | 1.5 |
LaptopA (1) | 800MB | 4000 | 1MB | 25 | 1.4 | 500.0 | SeqBlockTrans | 1.5 |
LaptopB (1) | 1000MB | 1100 | 512KB | 20 | 1.17 | 18.4 | ParProcBlockTrans(4) | 1.5 |
LaptopC (1) | 400MB | 1100 | 256KB | 20 | 1.67 | 16.6 | ParProcBlockTrans(4) | 1.4.2 |
LaptopC (1) | 400MB | 1100 | 256KB | 10 | 1.67 | 15.7 | ParProcBlockTrans(4) | 1.4.2 |
PcD (2ht) | 800MB | 4000 | 1MB | 26 | 3.0 | 333 | SeqBlockTrans | 1.5 |
PcD (2ht) | 800MB | 4000 | 1MB | 26 | 3.0 | 353 | ParProcBlockTrans | 1.5 |
PcD (2ht) | 800MB | 1100 | 1MB | 25 | 3.0 | 6.3 | SeqBlockTrans | 1.5 |
PcD (2ht) | 800MB | 1100 | 1MB | 25 | 3.0 | 4.0 | ParProcBlockTrans | 1.5 |
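The measured implementations are not reproduced here. As an illustration only, a sequential blocked multiplication on a transposed B might look like the sketch below. The class `BlockedTransposedMultiply` and its methods are hypothetical and are not the SeqBlockTrans code from the table. The loop order follows the cache reasoning given earlier: for each block of blockSize columns of B, stored as rows of the transposed matrix, one row of A and one row of C are in use at a time.

```java
// Illustrative sketch only; not the SeqBlockTrans implementation measured above.
final class BlockedTransposedMultiply {

    /** Returns the transpose of the square matrix b. */
    static double[][] transpose(double[][] b) {
        int n = b.length;
        double[][] bt = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                bt[j][i] = b[i][j];
            }
        }
        return bt;
    }

    /** Computes c = a * b for square matrices, blockSize columns of b at a time. */
    static double[][] multiply(double[][] a, double[][] b, int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        int n = a.length;
        double[][] bt = transpose(b); // rows of bt are columns of b
        double[][] c = new double[n][n];
        for (int jb = 0; jb < n; jb += blockSize) {
            int jEnd = Math.min(jb + blockSize, n);
            // The rows bt[jb..jEnd) form the block meant to stay in the L2 cache.
            for (int i = 0; i < n; i++) {
                double[] ai = a[i];
                double[] ci = c[i];
                for (int j = jb; j < jEnd; j++) {
                    double[] btj = bt[j];
                    double sum = 0.0;
                    for (int k = 0; k < n; k++) {
                        sum += ai[k] * btj[k]; // both operands are read row-wise
                    }
                    ci[j] = sum;
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        double[][] c = multiply(a, b, 1);
        System.out.println(java.util.Arrays.deepToString(c));
    }
}
```

A parallel variant could distribute the column blocks or the rows of A over threads, since each entry of C is written exactly once.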
Comparison with C++ and Pthreads.
java -cp lib/jomp1.0b.jar:. jomp.compiler.Jomp jomp-file
javac -cp lib/jomp1.0b.jar:. java-file
java -cp lib/jomp1.0b.jar:. -Djomp.threads=n class-file
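The three commands above translate a JOMP source file to Java, compile the result, and run it with n threads. A JOMP source is ordinary Java in which OpenMP-like directives appear as `//omp` comments. The following is a minimal sketch of such a file. The class name `RowParallelMultiply` is hypothetical, and the `//omp parallel for` directive is written as it is commonly shown in JOMP examples, so the exact form should be checked against the JOMP documentation. Compiled with plain javac, the directive is just a comment and the loop runs sequentially.

```java
// Would be saved as RowParallelMultiply.jomp (hypothetical file name).
final class RowParallelMultiply {

    /** Computes c = a * b for square matrices; each row of c is independent. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        // Assumed JOMP directive: distribute the iterations of the i loop over threads.
        //omp parallel for
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < n; k++) {
                double aik = a[i][k];
                for (int j = 0; j < n; j++) {
                    c[i][j] += aik * b[k][j]; // only row i of c is written
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        double[][] c = multiply(a, b);
        for (int i = 0; i < c.length; i++) {
            for (int j = 0; j < c[i].length; j++) {
                System.out.print(c[i][j] + " ");
            }
            System.out.println();
        }
    }
}
```

Parallelizing the outermost loop keeps the threads from writing to the same row of the result, so no synchronization is needed inside the loop body.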
Using JOMP with Eclipse, copied from: http://www.lst.ethz.ch/teaching/lectures/ss10/24/assignments/assignment_10/eclipse.txt
1. Add MatrixMultiply.jomp to your project
2. Add a new class MatrixMultiply.java to your project (this file will be overwritten by jomp)
3. copy jomp1.0b.jar to your project directory
4. Preferences: Add a file association for *.jomp as java files (for syntax highlighting and auto completion)
5. Project properties: Add jomp1.0b.jar to your project's build path
6. Project properties: Add a new builder for jomp.
   - Main Tab:
     Location: Full path to java
     Working Directory: ${workspace_loc:/project_name}
     Arguments: -classpath jomp1.0b.jar jomp.compiler.Jomp src/MatrixMultiply
   - Refresh Tab:
     Check "Refresh resources upon completion"
   - Build Options Tab:
     Check "During auto builds"
     Check "Specify working set of relevant resources"
     Click "Specify Resources" and select MatrixMultiply.jomp
Now jomp will be invoked each time you save MatrixMultiply.jomp. And syntax highlighting etc. also works.
javac -cp lib/mpp.jar:. java-file
Using MPJ Express with Eclipse, contained in: mpj-v0_38/doc/DebuggingWithEclipseIDE.pdf
1. prepare an Eclipse project
2. in project properties, "Java Build Path", "Libraries", add the mpj.jar from the lib path of MPJ Express
3. to run an MPJ Express program: in project properties, "Run/Debug Settings", in launch configurations select your MPJ program and edit the launch configuration
   - in the "Environment" tab, "new" add "MPJ_HOME" and as value give the path to your MPJ installation directory
   - in the "Arguments" tab add "-jar ${MPJ_HOME}/lib/starter.jar" to VM arguments and add "-np 8" to Program arguments, where 8 is the number of MPJ processes
   - select a "variables" button, select "edit variables", select "new" and add "MPJ_HOME" and as value give the path to your MPJ installation directory
   - run your MPJ program, the output will appear in the "Console" window
Using FastMPJ with Eclipse, contained in: http://torusware.com/extfiles/doc/fastmpj-documentation/userguide/UsersGuide.pdf
1. prepare an Eclipse project
2. in project properties, "Java Build Path", "Libraries", add the mpj.jar from the lib path of FastMPJ
3. to run a FastMPJ program: in "Run/External Tools/External Tools Configuration", add new "Program Configuration", say FastMPJ
   - in "Location", add the full path to the fmpjrun script of your computer
   - in "Working Directory", add the relative path of your project, if your project is named withFastMPJ then add ${workspace_loc:/withFastMPJ/bin}
   - in "Arguments", add the number of MPJ processes and the class name, for example, to request 4 processes and the current class name, add -np 4 -class ${java_type_name}
   - in the "Environment" tab, "new" add "FMPJ_HOME" and as value give the path to your FastMPJ installation directory
   - run your FastMPJ program, the output will appear in the "Console" window
Size of the implementations:
lines (factor) | bytes (factor) | Implementation |
---|---|---|
188 (1.0) | 4464 (1.0) | SeqByteTSP.java |
379 (2.0) | 9398 (2.1) | ParByteLocalTSP.java |
992 (5.3) | 26178 (5.9) | DistTSP.java |