Parallel Computing
The three major concerns of industrial CAE users are accuracy,
computation speed, and user-friendliness. True 3D simulation not only
addresses those concerns but also offers advantages that conventional
2.5D simulation cannot match, such as CAD integration, higher accuracy,
and minimal model simplification. However, true 3D simulation inevitably
increases computation time and requires more memory. Although
the High-Performance Finite Volume Method (HPFVM) employed
by Moldex3D/Solid already outperforms other 3D software,
users still expect significant further improvements.
For optical parts, fiber-reinforced automobile components, connectors,
gears, and similar products, the demand for highly accurate, high-speed
computation should never be underestimated. Computation speed
can be improved with newer, more powerful CPUs. However,
gains in CPU clock rate alone cannot satisfy industrial users'
demands for both speed and accuracy. Using multiple CPUs
is therefore the most effective approach.
Parallel computing is fast becoming an inexpensive alternative to
the conventional supercomputer for solving the large-scale problems
that arise in scientific and engineering applications. Generally,
there are two types of parallel computing platform: (1) Symmetric
Multiprocessor (SMP) and (2) Massively Parallel Processing (MPP).
The CPUs of an SMP platform share the same memory and are controlled
by a single operating system. Common SMP computers include
supercomputers, dual-CPU servers, and 4-CPU servers. Many industries
already use SMP computers as servers because they are simple
and easy to maintain. An MPP platform consists of multiple computers,
each with its own CPU and operating system. The computers communicate
and collaborate over a high-speed network (Myrinet or Gigabit Ethernet)
through a message-passing interface. Common MPP platforms are clusters
of standard PCs or high-end workstations. Because it is built from
standard PCs, a PC cluster generally offers the best cost/performance
ratio. Operating system costs can be even lower if Linux rather than
Windows is used. Supercomputers have many advantages,
but they are too expensive for most industries.
A PC cluster combines high performance with low cost,
so it is often referred to as "the Poor Man's Supercomputer".
Many of the 500 most powerful computer systems in the world
are PC clusters (www.top500.org).
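The difference between the two platform types can be illustrated with a small sketch. The code below imitates the MPP pattern: each "node" works only on its own slice of the data and reports a partial result as a message, and a manager combines the messages. It uses threads and a queue from the Python standard library as a stand-in for separate computers and a network; the function names are hypothetical, and this is not a description of how Moldex3D divides its work.

```python
import queue
import threading


def split_evenly(n_items, n_nodes):
    """Return (start, stop) index ranges that divide n_items among n_nodes."""
    base, extra = divmod(n_items, n_nodes)
    ranges, start = [], 0
    for rank in range(n_nodes):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def node_worker(rank, local_values, outbox):
    # Each "node" sees only its own data (private memory) and
    # reports its partial result as a message.
    outbox.put((rank, sum(local_values)))


def cluster_sum(values, n_nodes):
    """Sum values by giving each simulated node one slice of the data."""
    outbox = queue.Queue()  # stand-in for the network interconnect
    threads = []
    for rank, (start, stop) in enumerate(split_evenly(len(values), n_nodes)):
        t = threading.Thread(
            target=node_worker, args=(rank, values[start:stop], outbox)
        )
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    # The manager collects exactly one message per node and combines them.
    partials = dict(outbox.get() for _ in range(n_nodes))
    return sum(partials[rank] for rank in range(n_nodes))


if __name__ == "__main__":
    print(cluster_sum(list(range(1000)), n_nodes=4))
```

On a real cluster the queue would be replaced by a message-passing library running over the network, whereas on an SMP machine the workers could read the shared data directly without copying slices.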
Moldex3D/Solid now takes the lead in parallel computing, improving
computation performance so that analyses of more complex models
with larger element counts finish in less time than ever. The
high-performance parallelized kernel supports integrated
analyses of Flow, Pack, Cool, Warp, Fiber Orientation, and
RIM (Reactive Injection Molding). Furthermore,
Moldex3D/Solid parallel computing technology supports both
multi-CPU platforms and PC clusters.
The most common SMP computers are dual-CPU desktop PCs, which cost
only 20% to 35% more than single-CPU ones. With two CPUs,
computation is typically 60% to 75% faster. Quad-CPU (4) and
octa-CPU (8) high-end servers are also available from major
computer suppliers. Users can gain a significant performance
improvement by slightly increasing their hardware investment.
PC clusters are even more impressive in terms of
cost-performance. In general, a 4-node cluster is about
250% to 300% faster than a single-CPU PC, and a twenty-fold
speedup is possible with a 32-node PC cluster.
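The cost and speed figures above can be turned into two simple ratios. The sketch below is only an illustration: the function names are not from Moldex3D, and it reads "60% to 75%" faster as a speedup of 1.60 to 1.75 and "20% to 35%" more expensive as a relative cost of 1.20 to 1.35.

```python
def parallel_efficiency(speedup, n_processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / n_processors


def performance_per_cost(speedup, relative_cost):
    """Speedup per unit of hardware cost, relative to a single-CPU PC."""
    return speedup / relative_cost


if __name__ == "__main__":
    # Dual-CPU desktop: least and most favorable ends of the quoted ranges.
    low = parallel_efficiency(1.60, 2)
    high = parallel_efficiency(1.75, 2)
    print(f"dual-CPU efficiency: {low:.0%} to {high:.0%}")

    worst = performance_per_cost(1.60, 1.35)
    best = performance_per_cost(1.75, 1.20)
    print(f"dual-CPU performance per cost: {worst:.2f} to {best:.2f}")

    # 32-node cluster with the quoted twenty-fold speedup.
    print(f"32-node efficiency: {parallel_efficiency(20, 32):.1%}")
```

A performance-per-cost value above 1.0 means the extra hardware returns more speed than it costs, which is the sense in which a small additional investment pays off.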
Performance Benchmark
The following table demonstrates the parallel computing performance
of Moldex3D/Solid-Flow on various models. The test platform
is a dual Intel Xeon 2.4GHz machine running Windows XP Professional.
Although the speedup of Moldex3D/Solid-Flow depends partly on
the geometry, two CPUs typically run the analysis 60% to 75% faster.
Flow analysis is usually the most time-consuming step of all the
analyses, so this improvement implies a large reduction in overall
analysis time.
Platform: Dual Intel Xeon 2.4GHz CPU with 2 GB RAM
System: Windows XP Professional
Solver: Moldex3D/Solid-Flow R7.0

| Case# | Elements | 1 CPU (sec) | 2 CPU (sec) | Speed Up |
|-------|----------|-------------|-------------|----------|
| Case1 | 7,586 | 33 | 23 | 1.43 |
| Case2 | 113,978 | 8,400 | 5,300 | 1.58 |
| Case3 | 560,716 | 74,100 | 42,300 | 1.75 |

Note: Speed Up = Time(1 CPU) / Time(2 CPU)
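The Speed Up column can be reproduced directly from the two timing columns. The following sketch recomputes it for the Solid-Flow cases and also reports the share of wall-clock time saved; the helper names are illustrative only.

```python
# (case, elements, seconds on 1 CPU, seconds on 2 CPUs),
# copied from the Solid-Flow benchmark table.
FLOW_CASES = [
    ("Case1", 7_586, 33, 23),
    ("Case2", 113_978, 8_400, 5_300),
    ("Case3", 560_716, 74_100, 42_300),
]


def speedup(t_one_cpu, t_two_cpu):
    """Speed Up as defined in the table note: Time(1 CPU) / Time(2 CPU)."""
    return t_one_cpu / t_two_cpu


def time_saved(t_one_cpu, t_two_cpu):
    """Fraction of the single-CPU wall-clock time that the 2-CPU run avoids."""
    return 1.0 - t_two_cpu / t_one_cpu


if __name__ == "__main__":
    for name, elements, t1, t2 in FLOW_CASES:
        print(
            f"{name}: {elements:>7,} elements, "
            f"speedup {speedup(t1, t2):.2f}, "
            f"time saved {time_saved(t1, t2):.0%}"
        )
```

Note that the two measures describe the same run differently: a speedup of 1.75 corresponds to roughly 43% less wall-clock time, not 75% less.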
 |
The following table shows the parallel
computing performance of Moldex3D/Solid-Warp. Generally,
two CPUs run the analysis 70% to 88% faster.
Platform: Dual Intel Xeon 2.4GHz CPU with 2 GB RAM
System: Windows XP Professional
Solver: Moldex3D/Solid-Flow R7.0

| Case# | Elements | Nodes | 1 CPU (sec) | 2 CPU (sec) | Speed Up |
|-------|----------|-------|-------------|-------------|----------|
| Case1 | 12,500 | 15,000 | 119 | 64 | 1.86 |
| Case2 | 125,421 | 37,026 | 71 | 41 | 1.73 |
| Case3 | 128,456 | 49,258 | 98 | 54 | 1.81 |
| Case4 | 571,392 | 107,516 | 521 | 277 | 1.88 |
| Case5 | 1,233,203 | 291,784 | 2,721 | 1,688 | 1.61 |

Note: Speed Up = Time(1 CPU) / Time(2 CPU)
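One common way to interpret two-CPU speedups such as these is Amdahl's law, which models a run as a serial part plus a perfectly parallel part. The sketch below inverts that model to estimate the parallel fraction implied by a measured speedup. This is a textbook idealization added for illustration, not a description of how Moldex3D scales: it ignores communication cost and memory effects, so it should not be used to extrapolate to larger clusters with any confidence.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Ideal speedup when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def implied_parallel_fraction(measured_speedup, n_processors):
    """Invert Amdahl's law: the parallel fraction that yields this speedup."""
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / n_processors)


if __name__ == "__main__":
    # Lowest and highest two-CPU speedups in the Solid-Warp table.
    for measured in (1.61, 1.88):
        fraction = implied_parallel_fraction(measured, 2)
        print(
            f"speedup {measured:.2f} on 2 CPUs -> "
            f"parallel fraction about {fraction:.2f}"
        )
```

Under this model, the closer the two-CPU speedup is to 2.0, the larger the share of the run that is executing in parallel.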
 |
Besides accelerating computation, another major advantage of
parallel computing is the ability to handle huge models.
A current 32-bit CPU can address only up to 4GB of RAM. Excluding
the memory reserved for Windows itself, an application
program can access at most 3GB. Memory may therefore be
insufficient for some large-scale problems. Moldex3D
parallel computing makes it possible for 32-bit CPUs
to calculate bigger models than ever. For example, the
following table shows a dual Xeon machine applied to a huge
model with 3.2 million elements. Although the machine is
equipped with 8GB of RAM, a single-CPU run cannot obtain
enough memory because of the limitation of the CPU itself.
Moldex3D parallel computing overcomes this limitation.
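The memory argument can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that a parallel run divides the model's memory evenly among processes and that each process is subject to the 3GB limit described above. The 5GB footprint in the example is hypothetical, since the actual memory use of the 3.2-million-element model is not stated.

```python
ADDRESS_BITS = 32
ADDRESSABLE_BYTES = 2 ** ADDRESS_BITS  # 4 * 1024**3 bytes, i.e. 4GB
PER_PROCESS_LIMIT_GB = 3.0             # application limit quoted in the text


def fits_in_memory(total_gb, n_processes, limit_gb=PER_PROCESS_LIMIT_GB):
    """True if an even share of the model fits under the per-process limit.

    Assumes an even split with no duplicated data, which is an idealization.
    """
    return total_gb / n_processes <= limit_gb


if __name__ == "__main__":
    hypothetical_model_gb = 5.0  # illustrative value, not a measured footprint
    for n in (1, 2):
        verdict = "fits" if fits_in_memory(hypothetical_model_gb, n) else "memory not enough"
        print(f"{n} process(es): {verdict}")
```

The point of the example is that installed RAM (8GB here) is not the constraint; the per-process address limit is, and splitting the job across processes lowers the share each one must hold.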
CPU: Intel Xeon 2.8GHz
Mesh: 3.2 million elements
RAM: 8.0GB
Operating System: Windows 2003 Server Enterprise 32 bit

| CPU number | 1 | 2 |
|------------|---|---|
| Solid/Flow R7.0 | Memory Not Enough | 24.4 hours |
System Requirements:
Moldex3D parallel computing requires distributed systems with high-performance interconnects. The amount of data passed between processors in a typical Moldex3D run can range from hundreds to thousands of megabytes. The following are the basic system requirements.
Managing Node:
1. Microsoft Windows Server 2003 recommended.
2. Intel Pentium, Intel Xeon, Intel EM64T, AMD Athlon, or AMD Opteron based processor.
3. 2.0 GB RAM or greater.
4. Gigabit Ethernet or greater.
Computing Node:
1. Microsoft Windows XP Professional, Windows XP x64, or Windows 2000 recommended.
2. Intel Pentium, Intel Xeon, Intel EM64T, AMD Athlon, or AMD Opteron based processor.
3. 1.0 GB RAM or greater.
4. Gigabit Ethernet or greater.
Network:
1. Gigabit switch or greater.
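To see why a Gigabit interconnect is listed as the minimum, it helps to estimate how long the quoted data volumes take to move. The sketch below uses the nominal Gigabit Ethernet line rate of 1,000 megabits per second (125 megabytes per second) and ignores protocol overhead and latency, so real transfers will be slower. The function name and the sample volumes are illustrative.

```python
def transfer_seconds(megabytes, link_megabits_per_second=1000.0):
    """Lower-bound transfer time at the nominal line rate (no overhead)."""
    megabytes_per_second = link_megabits_per_second / 8.0
    return megabytes / megabytes_per_second


if __name__ == "__main__":
    # Sample volumes in the "hundreds to thousands of megabytes" range.
    for volume_mb in (100, 1000, 5000):
        gigabit = transfer_seconds(volume_mb)
        fast_ethernet = transfer_seconds(volume_mb, link_megabits_per_second=100.0)
        print(
            f"{volume_mb:>5} MB: at least {gigabit:6.1f} s on Gigabit Ethernet, "
            f"{fast_ethernet:6.1f} s on 100 Mbit/s Ethernet"
        )
```

Because this traffic is exchanged while the solver is running, time spent on the network adds directly to analysis time, which is why a slower interconnect can erase much of the gain from adding nodes.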