- 2006, 3, 2
-
Refactored the ring package: moved the more application-oriented
Groebner basis classes to a separate application package.
The ring package could now be renamed to a gb package.
Cosmetic and documentation improvements,
e.g. javadoc package descriptions and type parameter tags,
removed all tabs in all Java files.
Implemented generic Quotient(Ring), Residue(Ring) and Local(Ring).
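As an illustration of the element-plus-factory style that such generic
constructions follow, here is a minimal hypothetical sketch of a residue
ring over the integers; the names are illustrative, not the actual JAS
classes.

import java.math.BigInteger;

// Minimal sketch of a residue class ring Z/nZ in the generic element/ring
// style described above; illustrative names, not the actual JAS classes.
class ResidueRing {
    final BigInteger modulus;
    ResidueRing(BigInteger m) { modulus = m; }
    Residue fromInteger(BigInteger a) { return new Residue(this, a); }
}

class Residue {
    final ResidueRing ring;
    final BigInteger val;
    Residue(ResidueRing r, BigInteger a) {
        ring = r;
        val = a.mod(r.modulus); // reduce to the canonical representative
    }
    Residue sum(Residue b)      { return new Residue(ring, val.add(b.val)); }
    Residue multiply(Residue b) { return new Residue(ring, val.multiply(b.val)); }
    public String toString()    { return val + " mod " + ring.modulus; }
}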
- 2006, 2, 27
-
Implemented parallel solvable Groebner base algorithms and tests.
New class distributed ThreadPool.
Cosmetic and code improvements spotted by Eclipse, lint4j and JDepend.
Refactored module package to separate vector package with
basic linear algebra and list handling classes.
- 2006, 2, 19
-
Refactored to allow different Groebner base or reduction engines
where appropriate.
Split Syzygy etc. to an interface and a class.
Factored basic linear algebra out of Syzygy etc.
Adapted the Jython files to the Java code refactorings.
Reorganized jar files and documentation.
- 2006, 2, 12
-
Moved old examples to MAS subdirectory and new examples
to examples directory.
Implemented some right version algorithms using opposite rings.
- 2006, 2, 4
-
Switched from CVS to Subversion.
Fixed bugs in new left syzygy algorithm.
- 2006, 1, 14-29
-
More documentation and cosmetic changes.
Implementation of an extended Groebner Base algorithm
and arbitrary base syzygy computation.
- 2006, 1, 6
-
GenPolynomialTokenizer enhanced to parse algebraic number
and Galois field coefficients.
Fixed an error in leftNormalform.
- 2005, 12, 30
-
New classes CriticalPair and CriticalPairList replace OrderedPairlist.
Reworked the parallel and distributed GB versions to better respect
the sequential version's critical pair sequences.
Fixed some race conditions in parallel and distributed algorithms.
- 2005, 12, 28
-
Refactored all classes to remove static methods.
So, to use any method, an appropriate object must first be created.
The class organization has also changed to interfaces, abstract classes
and concrete classes, e.g. GroebnerBase, GroebnerBaseAbstract,
GroebnerBaseSeq, GroebnerBaseParallel and GroebnerBaseDistributed.
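A compilable sketch of that organization, with placeholder type parameters
and method bodies rather than the actual JAS signatures:

import java.util.List;

// Sketch of the interface / abstract / concrete organization; the method
// sets and type parameters are placeholders, not the exact JAS API.
interface GroebnerBase<P> {
    List<P> GB(List<P> F);
    boolean isGB(List<P> F);
}

abstract class GroebnerBaseAbstract<P> implements GroebnerBase<P> {
    // shared logic (e.g. the GB membership test) lives here
    public boolean isGB(List<P> F) {
        return false; // placeholder: test all S-polynomial reductions to zero
    }
}

class GroebnerBaseSeq<P> extends GroebnerBaseAbstract<P> {
    // sequential Buchberger algorithm; the parallel and distributed
    // variants extend the same abstract class
    public List<P> GB(List<P> F) {
        return F; // placeholder for the actual computation
    }
}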
- 2005, 12, 27
-
Implemented new Ideal class with some ideal operations
like sum, product, intersection and quotient.
Added TermOrder extension and contraction.
- 2005, 11-12
-
Updated documentation comments.
- 2005, 7, 24
-
Updated the old Java JDK 1.4 branch.
Bug fixes (in twoSidedGB), minor changes and cosmetics.
Updated documentation for new Java JDK 1.5 branch.
- 2005, 5-7
-
Working through all classes to introduce type parameters
and making all implied modifications.
- 2005, 5, 5
-
Switched to Java 1.5.
Now using covariant returns for implemented interfaces.
- 2005, 3, 25-29
-
Some module algorithms implemented.
Activated project web pages.
- 2005, 3, 12-19
-
Some Syzygy algorithms implemented.
Cosmetic changes to comments and forked the web log.
- 2005, 3, 5
-
For the Python language and interpreter there also exists a Java
version named Jython. This system can directly access Java classes
and execute Java methods.
Added some Jython modules Ring, Ideal, SolvRing and SolvIdeal
to access the JAS GB algorithms from Jython.
- 2005, 2, 25-28
-
The penalty of a commutative GB computed as a non-commutative left GB
is about 24% for (graded) Katsura 6 to 74% for (graded) Katsura 5.
A commutative GB computed as a non-commutative twosided GB
is about a factor of 4 for (graded) Katsura 5 to a factor of 9
for (graded) Katsura 5.
The penalty for a weighted degree term order compared to a graded term order
is not measurable; sometimes the weighted order is even better.
Fixed an error in polynomial getONE() to use the correct term order.
Parser for non-commutative problems with relation tables
(but only commutative representations) and a RunSGB main routine.
- 2005, 2, 14-21
-
New TermOrder with direct generation of Comparators for
ExpVector comparisons. Split term orders (elimination orders)
and weighted degree orders.
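A sketch of the idea, with exponent vectors as plain long arrays and only
a few of the orders (the actual TermOrder class covers more orders and the
split/elimination variants):

import java.util.Comparator;

// Sketch: a term order hands out Comparators over exponent vectors (long[]).
class TermOrderSketch {
    // lexicographic: compare exponents position by position
    static final Comparator<long[]> LEX = (a, b) -> {
        for (int i = 0; i < a.length; i++) {
            int s = Long.compare(a[i], b[i]);
            if (s != 0) return s;
        }
        return 0;
    };

    // graded lexicographic: total degree first, ties broken lexicographically
    static final Comparator<long[]> GRLEX = (a, b) -> {
        long da = 0, db = 0;
        for (int i = 0; i < a.length; i++) { da += a[i]; db += b[i]; }
        return da != db ? Long.compare(da, db) : LEX.compare(a, b);
    };

    // weighted degree order: a weight vector w replaces the total degree
    static Comparator<long[]> weighted(long[] w) {
        return (a, b) -> {
            long da = 0, db = 0;
            for (int i = 0; i < a.length; i++) { da += w[i] * a[i]; db += w[i] * b[i]; }
            return da != db ? Long.compare(da, db) : LEX.compare(a, b);
        };
    }
}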
- 2005, 2, 4-8
-
New unit test case for TermOrder.
Fixed a weak point in polynomial getONE() to use the correct number
of variables, i.e. the correct exponent vector size.
The polynomial constant ONE still has a wrong exponent vector size.
Deleted many old polynomial classes.
Implemented noncommutative product for solvable polynomials,
together with relation tables.
SolvableOrderedMapPolynomial extends OrderedMapPolynomial.
RatSolvableOrderedMapPolynomial extends SolvableOrderedMapPolynomial.
Interface SolvablePolynomial extends OrderedPolynomial.
Some more left multiplication methods, left reduction and
left Groebner Bases, twosided Groebner Base test and computation.
The Pairlist class changed to avoid usage of criterion 4 when running
on SolvablePolynomials; the criterion4() method itself also checks
for SolvablePolynomials.
RelationTable timings I, U(sl_3):

run / n | 2 |  3 |  4 |  5 |   6 |   7 |   8 |   9 |  10 |   11 |   12 |   13 |   14 |   15
first   | 3 | 13 | 32 | 92 | 128 | 188 | 274 | 420 | 683 | 1126 | 1795 | 2793 | 4380 | 6741
second  | 0 |  1 |  2 |  3 |   4 |   5 |   6 |   8 |  10 |   13 |   16 |   21 |   27 |   35
Timings in ms on AMD XP 2800+.
Computation of (Y^n) * (X^n) with respect to the relation
Y * X = X * Y - H.
In the first run the relation table is populated with the
products Y^i * X^i, i = 2,...,n.
In the second run the relations are reused, showing almost
no computing time anymore for the products.
RelationTable timings II, U(sl_3):

run / n |  2 |  3 |   4 |    5 |    6 |     7
first   | 28 | 94 | 303 | 1234 | 5185 | 24647
second  |  1 | 12 | 107 |  782 | 4569 | 23897
Second example shows the computation of
( Xa + Xb + Xc + Ya + Yb + Yc + Ha + Hb )^n
in U(sl_3).
Since in the relation table only products of two variables
are stored, the improvement is minimal (for low n).
- 2005, 1, 24-31
-
Removed old deprecated classes.
Todo: let DistHashTable implement the Map interface.
Reimplemented Reduction.Normalform so that asynchronous
update of the polynomial list is possible and respected.
In case a new polynomial arrives, the reduction is restarted
from the beginning; see the sketch below. Continuing with the work done
so far and re-reducing the so far irreducible terms would be an alternative.
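A sketch of the restart strategy, with the shared list and the reduction
engine as placeholder interfaces:

import java.util.List;

// Sketch: reduce p against a polynomial list that can grow concurrently;
// if a new polynomial arrived during the reduction, restart from scratch.
class RestartingReducer<P> {
    interface SharedList<P> { List<P> snapshot(); int version(); }
    interface Reduction<P>  { P normalform(List<P> F, P p); }

    P normalform(SharedList<P> shared, Reduction<P> red, P p) {
        while (true) {
            int v = shared.version();          // remember the list state
            P h = red.normalform(shared.snapshot(), p);
            if (shared.version() == v) {
                return h;                      // no update arrived: done
            }
            // else a new polynomial arrived; restart from the beginning
            // (continuing with h and re-reducing would be the alternative)
        }
    }
}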
Todo: use the JDK 1.5 java.util.concurrent package with interference-free
lists and BlockingQueues.
Distributed computation timings, Katsura 6 (G):

# Threads CPUs | # JVMs | time (sec) | #put | #remove | % total
1, seq         |      1 |      160.2 |   70 |     327 |    13.5
1, par         |      1 |      157.0 |   70 |     327 |    13.5
2, par         |      1 |       82.2 |   72 |     329 |    12.7
1, dist        |      1 |      177.2 |   77 |     334 |    11.4
2, dist        |      2 |       92.2 |   90 |     347 |     8.6
4, dist        |      2 |       56.2 |  112 |     369 |     5.9
8, dist        |      2 |       58.9 |  255 |     516 |     1.5
4, dist        |      4 |       51.2 |  117 |     374 |     5.5
6, dist        |      4 |       43.7 |  129 |     386 |     4.6
8, dist        |      4 |       62.9 |  259 |     519 |     1.5
Timings taken on a 16 CPU Intel Xeon SMP computer running
at 2.7 GHz and with 32 GB RAM.
JVM 1.4.2 started with AggressiveHeap and UseParallelGC.
#JVMs = number of distinct Java virtual machines.
#put = number of polynomials put to pair list.
#remove = number of pairs removed from the pair list,
i.e. after application of the criteria,
but including nulls up to now.
% total = percent of removed pairs out of the total pairs generated,
#remove / ( #put * (#put-1) / 2 ) * 100.
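For example, the sequential Katsura 6 run above gives
327 / (70 * 69 / 2) * 100 = 327 / 2415 * 100, i.e. about 13.5.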
Distributed computation timings, Katsura 7 (G):

# Threads CPUs | # JVMs | time (sec) | #put | #remove | % total
1, dist        |      1 |    24726.2 |  140 |     781 |     8.0
2, dist        |      2 |    12356.1 |  165 |     806 |     5.9
4, dist        |      4 |     6859.3 |  218 |     859 |     3.6
8, dist        |      4 |     7465.1 |  411 |    1054 |     1.2
8, dist        |      8 |     6412.9 |  344 |     986 |     1.6
8, dist        |      8 |     7173.3 |  399 |    1041 |     1.3
Overhead for distributed variant is about 10% in Katsura 6 (G).
Distributed 1 means one distributed process is running for the
reduction of S-polynomials. There is always a master process
handling polynomial input / output, setup and management of
distributed workers and handling of the pair list.
Communication between master and workers is always via TCP/IP
with object serialization, even if running on one computer.
- 2005, 1, 15-16
-
Further things to improve in the GB algorithms (from tsp):
local work queues, i.e. local Pairlists;
improve data locality in polynomials and lists;
communication using message types;
reduce object serialization in the DL broadcast
by using MarshalledObjects;
reduce communication on Pair send by not sending polynomials
but indices (requires a distributed hash table instead of the DL);
interleave communication and computation by adding a
communication thread in the distributed client.
New classes implementing a distributed hash table to hold the
polynomials in distributed GB. Index of polynomials in Pairlist
is used as hash key.
Communication is now using message types GBTransportMess.
Now polynomials are only transported once to each reducer since
only polynomial hash indexes are transported.
The distributed list is asynchronous and updated late,
so some duplicate H-polynomials (head terms) could be (and are) produced.
A solution might be a local put to the hash table with a dummy index.
Timings are not dramatically better.
Todo: check reduction algorithm to use later arriving polynomials.
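A sketch of such an index-keyed table with a blocking get for polynomials
that have not yet arrived (illustrative names, not the actual DistHashTable):

import java.util.HashMap;
import java.util.Map;

// Sketch: the pair list index of a polynomial is its key, so pair messages
// only carry indices and each polynomial travels to a reducer at most once.
class IndexedPolynomialTable<P> {
    private final Map<Integer, P> cache = new HashMap<>();

    // called when a broadcast (index, polynomial) message arrives
    synchronized void put(int index, P p) {
        cache.put(index, p);
        notifyAll(); // wake up reducers waiting for this polynomial
    }

    // blocks until the polynomial with the given index has arrived
    synchronized P get(int index) throws InterruptedException {
        while (!cache.containsKey(index)) {
            wait();
        }
        return cache.get(index);
    }
}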
- 2005, 1, 9-14
-
Introduced all direct improvements in util classes found so far.
(ChannelFactory and others)
On parallel computers the following JVM options (1.4.2)
must be used:
-Xms200M -Xmx400M -XX:+AggressiveHeap -XX:+UseParallelGC
Memory must be adjusted with respect to your situation.
Separated the versions with Pair Sequence Respecting Order (PS) from the
normal versions. The PS versions try to keep the order in which
reduced polynomials are added to the ideal base the same as in
the sequential version.
The normal versions now run OK on the parallel computer with the
right JVM options.
Refactoring with Eclipse (organize imports, static methods).
Parallel computation timings, Katsura examples:

# Threads CPUs | Katsura 6 TO(G) load* | Katsura 6 TO(G) empty+ | Katsura 7 TO(G) load* | Katsura 7 TO(G) empty+
seq            | 184.5                 | 153.3                  |                       |
1              | 181.5                 | 4% / 159.7             |                       | 28418.6
2              | 118.6                 | s2.02 / p2.11 / 75.6   |                       | p2.06 / 13760.0
4              | 76.8                  | s3.79 / p3.95 / 40.4   | 6256.9                | p4.56 / 6225.1
8              | 43.2                  | s7.19 / p7.49 / 21.3   | 3240.7                | p8.56 / 3318.8
10             | 42.5                  |                        |                       |
12             | 40.5                  |                        | 2288.1                | p9.90 / 2868.4
14             | 31.2                  |                        |                       |
16             | 51.9                  | s8.19 / p8.54 / 18.7   | 5376.4                | p12.59 / 2256.1
Timings taken on a 16 CPU Intel Xeon SMP computer running
at 2.7 GHz and with 32 GB RAM.
JVM 1.4.2 started with AggressiveHeap and UseParallelGC.
*) timing taken with other load on the CPUs.
+) timing taken with no other load on the CPUs.
Speedup: s = relative to sequential,
p = relative to parallel with one thread / CPU.
Scaling from 8 to 16 CPUs is bad, but this is also observed on
non-CA/GB examples (Java and C/FORTRAN).
- 2004, 10, 3
-
Generator for Katsura examples (from POSSO / FRISCO examples).
Timings on an AMD Athlon XP 2800 (2086 MHz) with log level INFO.
Log level WARN would gain 10-20%.
Timings, Katsura examples:

N (vars = N+1) | TermOrder | Seconds   | TermOrder | Seconds
7              | G         | 32044.204 | L         |
6              | G         |   112.641 | L         |
5              | G         |     4.195 | L         |
4              | G         |     0.431 | L         |  11.650
3              | G         |     0.153 | L         |   0.310
2              | G         |     0.031 | L         |   0.032
- 2004, 9, 20-26
-
Changed OrderedPairlist to record the sequence of pairs taken from the list.
New methods putParallel(), removeParallel() and helper methods.
Sequence numbers are generated, and reduced polynomials
are only put to the pair list if the corresponding pair number is in the
correct (sequential) order.
The ordered list / queue pairsequence (a TreeMap/SortedMap)
keeps track of the polynomials not yet put to the pairlist.
Parallelism is possible as long as there are pairs to be reduced.
But non-zero reduced polynomials are only put into the pairlist if
the polynomial would be put into the pairlist in the sequential
Buchberger algorithm; see the sketch below.
GroebnerBase algorithms had to be modified to record that a polynomial
reduced to zero.
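A sketch of this sequence bookkeeping (names and types are illustrative,
not the actual OrderedPairlist methods):

import java.util.List;
import java.util.TreeMap;

// Sketch: reduced polynomials are parked under their pair sequence number in
// a sorted map and released to the pair list only when all lower-numbered
// pairs are accounted for; a zero reduction still consumes its number.
class PairSequenceGate<P> {
    private final TreeMap<Integer, P> parked = new TreeMap<>();
    private int nextToRelease = 0;

    // h == null means the pair with this sequence number reduced to zero
    synchronized void putParallel(int seqNo, P h, List<P> pairlist) {
        parked.put(seqNo, h);
        // release the longest prefix of consecutively finished pairs
        while (!parked.isEmpty() && parked.firstKey() == nextToRelease) {
            P ready = parked.pollFirstEntry().getValue();
            if (ready != null) {
                pairlist.add(ready); // same order as in the sequential run
            }
            nextToRelease++;
        }
    }
}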
- 2004, 9, 19
-
Changed the order of Pair inserts for pairs with the same LCM of head terms.
Added a sequence number to each Pair and an indicator whether the Pair
did not reduce to zero.
- 2004, September
-
Implemented a Java server (ExecutableServer) for remote execution
of objects implementing the RemoteExecutable interface.
New setup for the distributed computation of GBs:
the GB master now sends the client code to some ExecutableServers
based on a machines file with host and port information about the
distributed environment.
Improved the PolynomialTokenizer so that it can read almost unedited
old MAS GB input files: ** exponents and parentheses around polynomials.
Lines starting with # are treated as comments.
Comments (* *) and parentheses within polynomials are
still not supported.
Implemented a common driver for GB computations, RunGB.
Sequential, thread-parallel and distributed computation can be selected
by command line parameters. The input is taken from a file. The number
of threads, respectively the number of distributed clients, can be specified.
For distributed execution the host and port information is taken from
a machines file.
Usage: RunGB [seq|par|dist|cli] <file> #procs [machinefile]
Added methods putCount() and remCount() in OrderedPairlist
to count the number of polynomials put to and pairs removed from
the pair data structure.
Pairlist put and remove, Trinks 6:

# Threads | 1 seq | 1 par | 2 par | 2 par | 3 par | 4 par | 5 par | 1 dist | 2 dist | 3 dist | 4 dist | 4 dist | 4 dist | 5 dist
# put     |    22 |    22 |    43 |    26 |    28 |    28 |    28 |     22 |     25 |     28 |     37 |     40 |     33 |     27
# remove  |    25 |    25 |    61 |    30 |    32 |    32 |    41 |     26 |     33 |     42 |     47 |     61 |     54 |     69
Timings @ 500 MHz on one CPU and one machine with log4j level INFO are:
ca. 2.5 - 3.5 seconds for sequential GB,
ca. 2.5 - 6 seconds for parallel GB,
ca. 5.5 - 9 seconds plus 5 seconds sync time for distributed GB.
Network shuffling of polynomials seems to account for 3 seconds in
this example.
Problem uncovered: the distributed version of GB needs an
average of 5 seconds to sync with all clients (on one machine).
This is way too much for execution times in the range of 2 to 8 seconds.
Redesign of DistributedList, now using a TreeMap to keep the list
entries in proper sequence. As keys, natural numbers are used, which
the server assigns to successive add() requests (see the sketch below).
The server now also holds a copy of the list itself, so
retransmission of list elements to late-arriving clients is possible.
This doubles the space required to store polynomials, but removes
initial delays in syncing all clients to receive all list elements.
By retransmission the DistributedList synchronization delay during
DistributedGB could be removed.
However, the problem of about 5 seconds delay in the startup
of DistributedGB still remains. It is not yet clear where and why this
delay occurs.
Further improvements would be the removal of list elements or the
clearing of the list.
Next steps could be distributed HashMap or TreeMap.
An important improvement would be to keep serialized copies of the
list elements (polynomials) at the server and to
avoid repeated serialization during broadcasts.
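A sketch of the server-side bookkeeping described above (illustrative
names, not the actual DistributedList code):

import java.util.Map;
import java.util.TreeMap;

// Sketch: the server assigns increasing natural-number keys to successive
// add() requests and keeps its own TreeMap copy, so the whole list can be
// replayed in proper sequence to a late-arriving client.
class ListServerState<E> {
    private final TreeMap<Long, E> elements = new TreeMap<>();
    private long nextKey = 0;

    // assign the next key, remember the element, return the key to broadcast
    synchronized long add(E element) {
        long key = nextKey++;
        elements.put(key, element);
        return key;
    }

    // snapshot for replaying the list to a client that connected late
    synchronized Map<Long, E> replay() {
        return new TreeMap<>(elements);
    }
}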
- 2004, 2, 26
-
Ideas to implement:
a distributed task launcher like mpirun,
a daemon to run on a distributed system to work with the launcher,
solvable polynomials and non-commutative GBs.
- 2004, 2, 1
-
Parallel computations with the Rose example are at ?? h with 3 threads
on 2 Pentium 4 @ 3.0 GHz hyperthreading CPUs.
With one thread the time is 30.6 h.
Besides the better CPU speed, this is a 5% improvement for JDK 1.4
and the new polynomial implementation compared to the older timings
with JDK 1.3.
- 2004, 1, 25
-
Spin model checker:
set up a Promela description of the parallel and distributed GB algorithms.
Used to verify the safety and liveness properties of the algorithms.
- 2004, 1, 17
-
New (minimal) class DistributedList in preparation for a
distributed GB algorithm.
Made all mutually transported objects Serializable.
Fixed ChannelFactory and other classes in edu.unima.ky.parallel
to be interruptible.
Moved Semaphore back to edu.unima.ky.parallel.
First version of a distributed memory GB.
One problem was that the inner class Pair forced serialization
of the outer class OrderedPairlist, which is not intended.
So Pair is now a proper (top-level) class.
Distributed version mainly working.
Problem with signaling and termination if 1 is in the GB.
- 2004, 1, 10
-
Refactored Groebner base for new OrderedPolynomials and
split sequential and parallel GB implementations.
New unit tests for sequential and parallel GBs.
GB now works for arbitrary Coefficients.
- 2004, 1, 5
-
New coefficient domains BigComplex and BigQuaternion.
New test unit for all coefficients.
Based on these coefficients are new polynomials, e.g.
ComplexOrderedMapPolynomial and QuatOrderedMapPolynomial
together with unit tests.
Problem: RatPolynomial requires DEFAULT_EVORD to be set correctly in ExpVector.
This is now solved for the new OrderedMapPolynomials.
- 2003, 12, 29
-
New organization of polynomials:
2 interfaces: UnorderedPolynomial and OrderedPolynomial.
Ordered polynomials have a term order and are implemented using
some SortedMap (e.g. TreeMap).
Unordered polynomials are implemented using some HashMap (e.g. LinkedHashMap).
2 abstract classes: UnorderedMapPolynomial and OrderedMapPolynomial.
Both implement all algorithms which do not need an explicit coefficient domain,
i.e. they are formulated in terms of the Coefficient interface from jas.arith.
2 or more classes which extend the respective abstract classes:
RatUnorderedMapPolynomial and RatOrderedMapPolynomial both explicitly
need coefficients from BigRational or similar.
Multiplication with ordered polynomials is about 8-10 times faster than
multiplication with unordered polynomials. Multiplication with
semi-ordered polynomials (LinkedHashMap) with order-preserving addition is
about 7-8 times slower than multiplication with ordered polynomials.
All implementations are based on the Map interface and classes.
The Map maps exponent vectors (from some monoid) to coefficients (from some domain).
This is in sync with the mathematical definition of multivariate polynomials
as mappings from some monoid to some domain.
Term orders are represented by a TermOrder class which provides the
desired Comparator classes for the SortedMap implementation.
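A sketch of this map-based representation, with a simplified coefficient
type and without zero-term cleanup (not the actual classes):

import java.math.BigInteger;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Sketch: a polynomial as a sorted map from exponent vectors to coefficients,
// sorted by the Comparator that a term order provides.
class MapPolynomial {
    final Comparator<long[]> order;
    final TreeMap<long[], BigInteger> terms;

    MapPolynomial(Comparator<long[]> termOrder) {
        order = termOrder;
        terms = new TreeMap<>(termOrder);
    }

    // addition merges coefficients on equal exponent vectors
    MapPolynomial sum(MapPolynomial b) {
        MapPolynomial s = new MapPolynomial(order);
        s.terms.putAll(terms);
        for (Map.Entry<long[], BigInteger> e : b.terms.entrySet()) {
            s.terms.merge(e.getKey(), e.getValue(), BigInteger::add);
            // a full implementation would remove terms that became zero
        }
        return s;
    }
}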
- 2003, 12, 28
- Things to do:
refactor and test HashPolynomial, check performance.
Improve the parallel GB algorithm, e.g. parallelize the reduction algorithm.
Develop a distributed memory version of the parallel GB algorithm.
- 2003, 12, 27
- Using ArgoUML to produce class diagrams for jas.
Refactoring GB from edu.jas.poly to edu.jas.ring.
- 2003, 12, 21
- Setup for JUnit version 3.8.1.
Added JUnit testing with Apache Ant version 1.6.0 to build.xml.
- 2003, September
-
Experimented with LinkedHashMap instead of TreeMap (SortedMap)
for the representation of polynomials.
Algorithms which work also with LinkedHashMap have 1 or 2 in their names
(now in new class HashPolynomial).
LinkedHashMap has the property of using the insertion order for
the iterator order.
With LinkedHashMap, add() and subtract() can be reformulated to
use a merging algorithm as in the original DIP implementation,
assuming the map insertion order is the same as the polynomial
term order.
However, the merging add/subtract implementation is a factor of
2 slower than the TreeMap implementation.
Complexity for a+b is
2*(length(a)+length(b))
for access and merging of pre-sorted polynomials, and
2*length(a)+length(b)+length(b)*log2(length(a+b))
for TreeMap clone, access and insertion.
The merging multiplication implementation is a factor of
10 slower than the TreeMap implementation.
Polynomial size was ~100 terms and the product contained ~8000 terms.
Complexity for a*b is
lab = length(a)*length(b) coefficient multiplications for both implementations,
plus 2*length(a*b)*length(b) for merging summands, respectively
plus length(a)*length(b)*log2(length(a*b)) for TreeMap insertion.
Since for sparse polynomials length(a*b) = lab, the TreeMap complexity
is asymptotically better in this case:
2*length(a)*length(b)*length(b) >= length(a)*length(b)*log2(length(a*b)).
For dense polynomials with length(a*b) ~ length(a)[+length(b)],
the LinkedHashMap complexity is asymptotically better:
2*length(a)*length(b) <= length(a)*length(b)*log2(length(a*b)).
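For comparison, a sketch of the merging add, under the stated assumption
that both LinkedHashMaps hold their terms in descending term order; exponent
vectors are collapsed to a single long key to keep the sketch short:

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: merging addition of two polynomials whose LinkedHashMaps are
// assumed to be in descending term order; long coefficients for brevity.
class MergingAdd {
    static LinkedHashMap<Long, Long> sum(LinkedHashMap<Long, Long> a,
                                         LinkedHashMap<Long, Long> b) {
        LinkedHashMap<Long, Long> r = new LinkedHashMap<>();
        Iterator<Map.Entry<Long, Long>> ia = a.entrySet().iterator();
        Iterator<Map.Entry<Long, Long>> ib = b.entrySet().iterator();
        Map.Entry<Long, Long> ea = ia.hasNext() ? ia.next() : null;
        Map.Entry<Long, Long> eb = ib.hasNext() ? ib.next() : null;
        while (ea != null && eb != null) {
            int c = Long.compare(ea.getKey(), eb.getKey());
            if (c > 0) {                        // term of a is larger: copy it
                r.put(ea.getKey(), ea.getValue());
                ea = ia.hasNext() ? ia.next() : null;
            } else if (c < 0) {                 // term of b is larger: copy it
                r.put(eb.getKey(), eb.getValue());
                eb = ib.hasNext() ? ib.next() : null;
            } else {                            // equal terms: add coefficients
                long s = ea.getValue() + eb.getValue();
                if (s != 0) r.put(ea.getKey(), s); // drop cancelled terms
                ea = ia.hasNext() ? ia.next() : null;
                eb = ib.hasNext() ? ib.next() : null;
            }
        }
        for (; ea != null; ea = ia.hasNext() ? ia.next() : null)
            r.put(ea.getKey(), ea.getValue());
        for (; eb != null; eb = ib.hasNext() ? ib.next() : null)
            r.put(eb.getKey(), eb.getValue());
        return r;
    }
}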
- 2003, 3, 15
-
Some testing with a new
AMD Athlon XP 2200+ @ 1.8 GHz, 133x2 MHz memory access.
Trinks 6: 1.013 sec with log4j level WARN, parallel 1.058 - 1.740 sec.
Trinks 7: 0.553 sec with log4j level WARN.
- 2003, 2, 2
-
Replacing all System.out.println calls by Log4j logging calls.
Adding some JUnit tests.
- 2003, 1, 28
-
Some testing with gcj (GNU Java Compiler). This compiler generates
native executable binaries. The timings are not better than with
a JIT, often worse.
Parallel computations with the Rose example are at 18 h with 4 threads
on 2 Pentium 4 @ 2.4 GHz hyperthreading CPUs. With one thread the time
is 40 h.
- 2003, 1, 26
-
Timings with JDK 1.3 (from IBM) are 30% to 40% faster
than with JDK 1.4.1 (from Sun).
Timings on a PowerPC 604 @ 200 MHz with JDK 1.3 (from IBM) and
JDK 1.2 (from Sun) are a factor of 3.5-4.5 slower than on an Intel PIII @ 450 MHz.
On the PowerPC, not using a JIT is faster than using one, by ca. 20-30%.
- 2003, 1, 12
-
General differences between sequential and parallel GB algorithms:
- The parallelization is achieved by doing the normal form reductions
in parallel.
- Since the reductions may require different times to complete,
the normal forms may be entered into the list of polynomials
in a different (time) order than in the sequential case.
So the pairs may be generated in a different order.
However, since the list of pairs is ordered wrt. the lcm of
the head terms, this seems not to be a big issue.
- Since several reductions are scheduled in parallel,
the pairs are taken from the list of pairs in a different order
than in the sequential case.
One should not take more pairs from the list than can be reduced
in parallel.
Does this influence the validity of criterion 3?
Access to the pairlist is synchronized.
Pairs are marked as reduced as soon as they are taken from the list.
But the algorithm terminates only after all reductions of pairs have
terminated. So criterion 3 holds.
New implementation of parallel version of GB.
Removal of pairs is now also in parallel.
But the ordering of pair insertion is no longer preserved.
Timings are more stable
and slightly better than those of the sequential GB.
Todo: unit tests, comments, ...
- 2003, 1, 7-11
-
Renamed jas to mas2jas;
new CVS repository for jas, import and checkout.
Improved parallel version of GB.
- The sequence of polynomials added to the pair list is kept
stable, i.e. a pair scheduled for reduction at time t
will (if non-zero) enter the list before any pair
scheduled at time t+x.
- This limits the parallelism, since polynomials which reduce to zero
keep a reducer thread occupied until older polynomials are
finally reduced. One could eventually hand over the blocking
object to the next thread.
- The number of pairs scheduled for reduction is now also limited
to the number of parallel reduction threads. This avoids
scheduling too many pairs with high lcm before newly created
pairs with lower lcm arrive.
- The limiting of scheduled pairs could also be implemented using
a BoundedBuffer/Queue for the ThreadPool workpile.
Then addJob() would block until free threads are available.
- The sequencing and limiting could eventually also be
achieved by having the reducing threads take the pairs
themselves instead of the master thread.
The main thread would then simply wait until all
threads are idle.
- The final reduction of the GB to a minimal GB is now
also parallelized.
Todo: CVS, comments,
polynomial implementation using LinkedList,
parallel GB simplification.
With the improved algorithm the running time of the parallel GB
becomes more stable and not slower than the sequential GB.
However there is still no significant speedup.
Parallel GB on one CPU:

# Threads    |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 | 16
# Reductions | 25 | 25 | 27 | 25 | 25 | 25 | 25 | 25 | 25
Parallel GB on 4 CPUs:

# Threads    |  1 |  2 |              3 |  4 |  5 |  6 |  7 |  8 | 16
# Reductions | 22 | 24 | 30, 28, 24, 29 | 28 | 29 | 42 | 32 | 32 | 37
- 2003, 1, 6
- Implemented a parallel version of GB using ThreadPool from tnj.
Parallel results:
Trinks 7: mas 0.598 sec, jas 0.918 sec, jas par 0.955 sec
Trinks 6: mas 26.935 sec, jas 3.211 sec, jas par 3.656 sec
mas: including startup and gc time,
jas: excluding startup of the JVM and including gc time,
jas par on a single processor,
timing on P-3 @ 450 MHz.
This makes for a penalty of 12 percent;
all tests with output to files.
- 2003, 1, 5
- Timing benchmarks between the BigRational and Coefficient versions
of the algorithms:
the difference seems to be less than 1 percent, with
no clear advantage for either side.
The sum and product algorithms do not yet have optimal complexity:
sum has 2 l log(l) instead of 2 l
because of TreeMap.remove() and TreeMap.put();
prod has l^2 log(l^2/2) instead of l^2 because of TreeMap.put().
TreeMap cannot be used for this;
some kind of SortedLinkedList or SortedHashMap would be required,
and the latter would in turn require a hashValue() for an ExpVector.
Implemented edu.jas.arith.BigInteger which implements Coefficient,
tested with IntPolynomial which extends Polynomial.
Todo: alternative implementations, cleanup of RatPolynomial, parallel GB,
conversion RatPolynomial <--> IntPolynomial.
- 2003, 1, 4
- Added reading of PolynomialLists and variable lists to
PolynomialTokenizer;
implemented PolynomialList.
Refactoring RatPolynomial to extend Polynomial,
implementing IntPolynomial which extends Polynomial.
Making BigInteger implement Coefficient will require
a delegating extension of BigInteger.
- 2003, 1, 3
- implemented PolynomialTokenizer to read RatPolynomials
- 2003, 1, 2
- File RatPolynomial split into RatGBase;
criterion 3 implemented with a BitSet.
Second results (with new criterion 3 in jas):
Trinks 7: mas 0.598 sec, jas 1.159 sec
Trinks 6: mas 26.935 sec, jas 6.468 sec
mas: including startup and gc time,
jas: excluding startup of the JVM and including gc time.
Implemented DIGBMI; the H-polynomial was not monic in DIRPGB.
Third results (with new criterion 3 in jas and GBminimal):
Trinks 7: mas 0.598 sec, jas 0.918 sec
Trinks 6: mas 26.935 sec, jas 3.211 sec
mas: including startup and gc time,
jas: excluding startup of the JVM and including gc time,
timing on P-3 @ 450 MHz.
This makes for a factor of 8-9 improvement;
all tests with output to files.
Startup of the JVM is approx. 1.0-1.2 sec.
Most time is spent in BigInteger:
java -Xrunhprof:cpu=times,format=a
CPU TIME (ms) BEGIN (total = 136) Thu Jan 2 18:33:53 2003
rank self accum count trace method
1 15,44% 15,44% 596610 21 java.math.MutableBigInteger.rightShift
2 13,24% 28,68% 582132 15 java.math.MutableBigInteger.difference
3 12,50% 41,18% 612760 19 java.math.MutableBigInteger.getLowestSetBit
4 9,56% 50,74% 2 9 java.lang.Object.wait
5 9,56% 60,29% 5271 22 java.math.MutableBigInteger.binaryGCD
6 6,62% 66,91% 612760 23 java.math.BigInteger.trailingZeroCnt
7 5,88% 72,79% 592152 18 java.math.BigInteger.bitLen
8 5,88% 78,68% 6018 20 java.math.MutableBigInteger.binaryGCD
9 5,15% 83,82% 578887 25 java.math.MutableBigInteger.normalize
10 4,41% 88,24% 550992 24 java.math.MutableBigInteger.primitiveRightShift
11 4,41% 92,65% 1 10 java.lang.Object.wait
12 3,68% 96,32% 582132 12 java.math.MutableBigInteger.compare
13 0,74% 97,06% 35965 13 edu.jas.poly.ExpVector.EVILCP
14 0,74% 97,79% 11612 14 java.math.BigInteger.divide
15 0,74% 98,53% 5866 11 java.math.MutableBigInteger.divide
16 0,74% 99,26% 9032 16 java.math.MutableBigInteger.divide
17 0,74% 100,00% 9032 17 java.math.BigInteger.divide
CPU TIME (ms) END
- 2003, 1, 1
- renaming packages to edu.jas,
renaming to arith, poly and ring,
new Makefile, project documentation in XHTML,
using JDK 1.4 with JIT.
first results (without criterion 3 in jas):
Trinks 7: mas 0.598 sec, jas 1.373 sec
Trinks 6: mas 26.935 sec, jas 30.935 sec
mas: including startup and gc time,
jas: excluding startup of jvm and including gc time.
timing on P-3@450
- 2002, 12, 31
- starting with extraction of relevant files in new directory
- 2000, 7, 21
- keySet/get replaced by entrySet/getval.
Implemented S-Polynomial, normal form, irreducible set.
Implemented Groebner base with criterion 4. Criterion 3 left to do.
- 2000, 7, 16
- Implemented and tested BigRational based on Java's BigInteger.
With and without I/O, BigRational addition, multiplication and
comparison are approx. 10 times faster than the respective
SACRN functions.
Implemented and tested ExpVector based on Java's int arrays.
Implemented and tested RatPolynomial based on Java's TreeMap.
Static methods: DIRPDF, DIRPWR via toString, DIRPON, DIRPMC,
DIRPPR, DIRPSM, DIRRAS.
Consider replacing keySet/get by entrySet/getval where appropriate.
Can there be a generic Polynomial class?
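A minimal sketch of such a rational arithmetic on top of
java.math.BigInteger (normalize by the gcd, keep the denominator positive);
illustrative, not the actual BigRational class:

import java.math.BigInteger;

// Sketch: rational numbers over java.math.BigInteger, always normalized.
class Rational {
    final BigInteger num, den;

    Rational(BigInteger n, BigInteger d) {
        if (d.signum() < 0) { n = n.negate(); d = d.negate(); }
        BigInteger g = n.gcd(d);               // gcd(0, d) = d, so 0/d -> 0/1
        num = n.divide(g);
        den = d.divide(g);
    }

    Rational sum(Rational b) {
        return new Rational(num.multiply(b.den).add(b.num.multiply(den)),
                            den.multiply(b.den));
    }

    Rational product(Rational b) {
        return new Rational(num.multiply(b.num), den.multiply(b.den));
    }

    public String toString() { return num + "/" + den; }
}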
- 2000, 7, 15
- Some testing with Java's built-in BigInteger class.
Without I/O the built-in multiplication is approx. 15 times faster
than the SAC product.
With much I/O the built-in multiplication is approx. 3 times faster
than the SAC product.
The built-in class also uses two's-complement representation, which is
bad for decimal printing.
This will be the end of list processing for Jas.
DIPRNGB needs DIRPDF, DIRPWR, DIRPON, DIRPMC, DIRPPR, DIRPSM.
These need RNDIF, RNDWR, RNINT, RNSIGN, ISRNONE, RNONE, RNZERO,
RNINV (and DIRPRP), RNPROD, RNSUM.
- 2000, 7, 14
- Class BigInteger based on SACI with toString().
Class List based on MASSTOR with toString().
Problem with Modula-2 module initialization with main(args)
method: initialization of static variables.
- 2000, 7, 5
- Finished testing most of masarith.
- 2000, 7, 2
- Finished porting masarith.
Testing needs still to be done. MASAPF is ok.
- 2000, 6, 24
- Perl script getc.pl to grab comments in Modula-2 source
and add to Java source.
Comments grabbed on most working files so far.
Generated new documentation.
- 2000, 6, 22
- Future directions:
- Parallel GB.
- Move to Java without Modula-2.
- Develop an applet version.
- Implement more of Mas.
- Replace SACI by Java BigInteger.
- Setup for real objects:
implement constructors,
implement methods: a.method(b) as STATIC_METHOD(a,b),
use real objects instead of only Cell objects,
define interfaces e.g. for ring, polynomial ring,
module, group, field.
- Small additions:
toString equivalents of the xWRITEs.
Problems identified:
- LISTs over Beta-Integers are different from LISTs over LISTs
when implementing LISTs as Java types;
LISTs over other types will also require special handling.
- 2000, 6, 22
- GB running on Trinks (small and big).
Jas is about 5-8 times slower than MAS in GB computation.
Using JDK 1.1.7 without JIT on Linux, PII-300.
Using JDK 1.2.2-RC3 without JIT on Linux, PII-300 is
about 6-9 times slower.
Using JDK 1.2.2-RC4 with JIT (sunwjit) on Linux, PII-300 is
about 2-4 times slower.
Implemented Java input via BufferedReader.
- 2000, 6, 21
- Got GB running. Problem was in EQUAL.
- 2000, 6, 17
- Placed under control of CVS.
Beginning with a clean version from Uni Passau.
Incorporated Java-specific changes up to now.
- 2000, 6, 10
- Transformation of DIPRNGB finished.
Important parts of maspoly finished.
- 2000, 6, 4
- Transformation of SACPRIM finished.
Most of maskern finished.
Important parts of masarith finished.
- 2000, 5, 29
- MASSTOR: Mapping MAS lists directly to a Java list class and using
Java's garbage collector.
Data types LIST and GAMMAINT are now distinct.
Buying the MHC Compiler (UK Pound 59).
- 2000, 5, 28
- MASSTOR: First attempt to use list class with own garbage collection.
Using the constructor to record list pointers.
- 2000, 5, 27
- Beginning of the first tests.
Conversion of .md to .def, .mi to .mod.
- 2000, 5, 26
- Discovery of the MHC Modula-2 to Java compiler.
Mill Hill & Canterbury Corporation, Ltd, URL
http://www.mhccorp.com