27 results found (search time: 0 ms)
1.
The distinct lattice spring model (DLSM) is a newly developed numerical tool for modeling rock dynamics problems, i.e., dynamic failure and wave propagation. This paper presents the parallelization of DLSM. With advances in parallel computing hardware and software, parallelizing a code has become easier than before, and many options are now available. Here, Open Multi-Processing (OpenMP) on a multicore personal computer (PC) and the Message Passing Interface (MPI) on a cluster are selected as the environments for parallelizing DLSM. The performance of the parallel DLSM codes is tested on different computers. The OpenMP version reaches a maximum speed-up of 4.68× on a quad-core PC, and the MPI version achieves a speed-up of 40.886× when 256 CPUs are used on a cluster. Finally, a high-resolution model with four million particles, which is too large for the serial code to handle, is simulated with the parallel DLSM code on a cluster. It is concluded that the parallelization of DLSM is successful. Copyright © 2011 John Wiley & Sons, Ltd.
2.
3.
[Abstract text lost to character-encoding damage; only the terms OpenMP and MPI remain legible.]
4.
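A fast Kalman filtering method based on single-observation updates is proposed, and shared-memory parallel programming (OpenMP) is introduced to accelerate the covariance update, enabling fast real-time estimation of undifferenced GPS satellite clock offsets. Fifty-five evenly distributed IGS reference stations are selected to estimate satellite clock offsets at a 60 s sampling interval for 2017-03-20 to 2017-03-30. The estimates agree well with the IGS post-processed 30 s clock products, with an RMS difference better than 0.5 ns. Precise point positioning (PPP) is then performed at 10 IGS reference stations that were not used in the clock estimation. The results show that real-time static PPP achieves a horizontal accuracy better than 2 cm and a vertical accuracy of 2-4 cm, while real-time kinematic PPP achieves a horizontal accuracy of 2-4 cm and a vertical accuracy of 4-6 cm, which meets the accuracy requirements of real-time PPP. On a server with a 1.2 GHz clock frequency, the method takes 4 s per epoch with 8 parallel threads, an efficiency improvement of one third over serial mode.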
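The abstract does not give the filter equations, so the following is only a minimal sketch of the textbook scalar (one observation at a time) Kalman measurement update, which avoids any matrix inversion, with the covariance update loop marked for OpenMP. The function name `kalman_scalar_update`, the dense row-major storage of `P`, and the limit `N_MAX` are assumptions made for illustration, not details from the paper.

```c
#include <assert.h>

#define N_MAX 64  /* assumed upper bound on the state dimension */

/* Process one scalar observation z = h^T x + noise (variance r).
   x: state (n), P: symmetric covariance (n x n, row-major), h: design row (n). */
void kalman_scalar_update(int n, double *x, double *P,
                          const double *h, double z, double r)
{
    assert(n > 0 && n <= N_MAX);
    double ph[N_MAX];   /* P * h */
    double s = r;       /* innovation variance h^T P h + r */
    double innov = z;   /* innovation z - h^T x */
    int i, j;

    for (i = 0; i < n; i++) {
        double acc = 0.0;
        for (j = 0; j < n; j++)
            acc += P[i * n + j] * h[j];
        ph[i] = acc;
    }
    for (i = 0; i < n; i++) {
        s += h[i] * ph[i];
        innov -= h[i] * x[i];
    }

    /* State update: x += K * innov, with gain K = P h / s. */
    for (i = 0; i < n; i++)
        x[i] += ph[i] * innov / s;

    /* Covariance update: P -= (P h)(P h)^T / s. Rows are independent,
       so the outer loop can be shared among OpenMP threads. */
    #pragma omp parallel for private(j)
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            P[i * n + j] -= ph[i] * ph[j] / s;
}
```

Calling the function once per observation in an epoch replaces a single large matrix inversion with a sequence of rank-one updates; the O(n²) covariance loop is the part that dominates for large state vectors, which is consistent with the abstract's choice to parallelize it.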
5.
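As processor manufacturing technology approaches its limits, raising the clock frequency alone is no longer a viable way to improve performance, which has driven the shift from single-core to multicore processors. After years of development, multicore processors are now the mainstream configuration, yet most meteorological programs are still serial, which wastes much of the processor's computing resources. MPI and OpenMP, the two main parallel environments, each have their own strengths. MPI suits distributed-memory computers but requires extensive and difficult changes to the program. OpenMP uses shared memory and requires fewer changes, so it is comparatively better suited to parallel computing on multicore processors. Through an attempt to speed up CALMET by parallelizing it with OpenMP, this paper describes a general approach to the OpenMP parallelization of serial programs. The main steps are: profile the serial program and identify the most time-consuming sections to rewrite in parallel; parallelize the loops with OpenMP, making intermediate variables private to each thread; compile and run the parallel program and compare performance; and check that the parallel and serial runs produce the same output.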
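The loop-level steps listed above can be sketched as follows. This is a hypothetical kernel, not CALMET code: the functions `energy_serial`, `energy_omp`, and `max_abs_diff` and the wind-component arrays `u` and `v` are invented for illustration. It shows an intermediate variable declared outside the loop (as is common in legacy serial code) being made thread-private, and a helper for the final step of comparing serial and parallel output.

```c
#include <assert.h>

/* Serial reference: kinetic energy per unit mass for n grid cells. */
void energy_serial(const double *u, const double *v, double *ke, int n)
{
    double tmp;  /* intermediate variable reused on every iteration */
    int i;
    for (i = 0; i < n; i++) {
        tmp = u[i] * u[i] + v[i] * v[i];
        ke[i] = 0.5 * tmp;
    }
}

/* OpenMP version: the only changes are the pragma and private(tmp).
   Without private(tmp) all threads would share one tmp and race on it. */
void energy_omp(const double *u, const double *v, double *ke, int n)
{
    assert(n >= 0);
    double tmp;
    int i;
    #pragma omp parallel for private(tmp)
    for (i = 0; i < n; i++) {
        tmp = u[i] * u[i] + v[i] * v[i];
        ke[i] = 0.5 * tmp;
    }
}

/* Consistency check: largest absolute difference between two result arrays. */
double max_abs_diff(const double *a, const double *b, int n)
{
    double worst = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        double d = a[i] - b[i];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

With GCC or Clang the parallel version is enabled by compiling with `-fopenmp`; without that flag the pragma is ignored and the code runs serially, which makes side-by-side comparison of the two builds straightforward.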
6.
Large-scale engineering computing using the discontinuous deformation analysis (DDA) method is time-consuming, which hinders the application of the method. The simulation result of a typical numerical example indicates that the linear equation solver is a key factor affecting the efficiency of DDA. In this paper, highly efficient algorithms for solving linear equations are investigated, and two modifications of the DDA programme are presented. The first modification is a more efficient linear equation solver. The block Jacobi (BJ) iterative method and the block conjugate gradient method with Jacobi preconditioning (Jacobi-PCG) are introduced, and the key operations are detailed, including the matrix-vector product and the diagonal matrix inversion. The second modification is a parallel linear equation solver, constructed separately for multi-thread platforms with OpenMP and for CPU-GPU heterogeneous platforms with CUDA. The simulation results from several numerical examples using the modified DDA programme demonstrate that Jacobi-PCG is the better iterative method for large-scale engineering computing and that the adopted parallel strategies can greatly enhance computational efficiency. Copyright © 2015 John Wiley & Sons, Ltd.
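As a rough illustration of the two key operations named in the abstract (matrix-vector product and diagonal inversion), here is a minimal sketch of a conjugate gradient solver with a Jacobi (diagonal) preconditioner. It is a simplification under stated assumptions: the matrix is dense, row-major, and symmetric positive definite, and the preconditioner is point-wise rather than block-wise as in DDA. The names `jacobi_pcg`, `dot`, and `N_MAX` are hypothetical.

```c
#include <assert.h>

#define N_MAX 256  /* assumed upper bound on the system size */

static double dot(int n, const double *a, const double *b)
{
    double s = 0.0;
    int i;
    #pragma omp parallel for reduction(+:s)
    for (i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Solve A x = b; returns the iteration count, or -1 if not converged.
   Convergence test: squared residual norm <= tol^2. */
int jacobi_pcg(int n, const double *A, const double *b,
               double *x, double tol, int max_iter)
{
    assert(n > 0 && n <= N_MAX);
    double r[N_MAX], z[N_MAX], p[N_MAX], q[N_MAX];
    double rz, rz_new, alpha, beta;
    int i, j, k;

    for (i = 0; i < n; i++) {
        x[i] = 0.0;
        r[i] = b[i];                 /* residual for the zero initial guess */
        z[i] = r[i] / A[i * n + i];  /* apply the inverse diagonal */
        p[i] = z[i];
    }
    rz = dot(n, r, z);

    for (k = 0; k < max_iter; k++) {
        if (dot(n, r, r) <= tol * tol)
            return k;

        /* Matrix-vector product q = A p; rows are independent. */
        #pragma omp parallel for private(j)
        for (i = 0; i < n; i++) {
            double acc = 0.0;
            for (j = 0; j < n; j++)
                acc += A[i * n + j] * p[j];
            q[i] = acc;
        }

        alpha = rz / dot(n, p, q);
        for (i = 0; i < n; i++) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
            z[i] = r[i] / A[i * n + i];
        }
        rz_new = dot(n, r, z);
        beta = rz_new / rz;
        for (i = 0; i < n; i++)
            p[i] = z[i] + beta * p[i];
        rz = rz_new;
    }
    return (dot(n, r, r) <= tol * tol) ? max_iter : -1;
}
```

A production DDA solver would store only the non-zero sub-matrices and invert each diagonal block instead of dividing by scalar diagonal entries; the control flow of the iteration stays the same.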
7.
8.
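The discontinuous deformation analysis (DDA) method strictly satisfies equilibrium requirements and energy conservation and is fully reliable both kinematically and numerically, but numerical simulation of large-scale geotechnical engineering problems takes too long, especially in solving the linear equations; parallel computing can address this problem well. First, based on the basic theory of DDA, the block-based compressed row storage scheme and the recording of non-zero positions under the "trial-and-error" iteration scheme are described. Second, the block Jacobi iterative method is introduced to solve the DDA linear equations in parallel, and the corresponding non-zero storage method is improved. Finally, the parallel solution of the DDA linear equations is implemented with OpenMP and applied to the failure process analysis of an underground cavern group. Using speed-up as the measure of parallel efficiency, the results show that the parallel strategy greatly improves the computational efficiency of DDA and is suitable for problems of all scales.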
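The reason Jacobi-type iteration parallelizes well is that every unknown in the new iterate depends only on the previous iterate, so rows can be updated independently. The sketch below shows this with a point Jacobi iteration on a dense row-major matrix; this is a simplifying assumption, since the paper uses block Jacobi with compressed block-row storage. The function name `jacobi_solve` and its signature are hypothetical, and the `max` reduction requires OpenMP 3.1 or later.

```c
#include <assert.h>

/* Jacobi iteration for A x = b. x holds the initial guess on entry and the
   solution on exit; x_new is caller-provided scratch space of length n.
   Returns the number of iterations used, or -1 if not converged. */
int jacobi_solve(int n, const double *A, const double *b,
                 double *x, double *x_new, double tol, int max_iter)
{
    assert(n > 0);
    int i, j, k;

    for (k = 1; k <= max_iter; k++) {
        double max_change = 0.0;

        /* Each row reads only the old iterate x, so rows are independent. */
        #pragma omp parallel for private(j) reduction(max:max_change)
        for (i = 0; i < n; i++) {
            double sum = b[i];
            double change;
            for (j = 0; j < n; j++)
                if (j != i)
                    sum -= A[i * n + j] * x[j];
            x_new[i] = sum / A[i * n + i];

            change = x_new[i] - x[i];
            if (change < 0.0) change = -change;
            if (change > max_change) max_change = change;
        }

        for (i = 0; i < n; i++)
            x[i] = x_new[i];

        if (max_change < tol)
            return k;
    }
    return -1;
}
```

Convergence is guaranteed for strictly diagonally dominant matrices; in the block variant the scalar division by the diagonal entry becomes a small dense solve with the diagonal sub-matrix of each block.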
9.
Guiming Zhang, Qunying Huang, John H. Keel 《International Journal of Geographical Information Science》2016,30(11):2230-2252
Performing point pattern analysis using Ripley's K function on large sets of point events is computationally intensive, as it involves massive point-wise comparisons, time-consuming calculation of edge effect correction weights, and a large number of simulations. This article presented two strategies to optimize the algorithm for point pattern analysis using Ripley's K function and utilized cloud computing to further accelerate the optimized algorithm. The first optimization sorted the points on their x and y coordinates and thus narrowed the search for neighboring points down to a rectangular area around each point when estimating the K function. Using the actual study area in computing edge effect correction weights is essential for an unbiased estimate of the K function, but is very computationally intensive if the study area has a complex shape. The second optimization reused previously computed weights to avoid repeating the expensive weight calculation. The optimized algorithm was then parallelized using Open Multi-Processing (OpenMP) and hybrid Message Passing Interface (MPI)/OpenMP on the cloud computing platform. Performance testing showed that the optimizations accelerated point pattern analysis using the K function by a factor of 8 in both the sequential version and the OpenMP-parallel version of the optimized algorithm. While the OpenMP-based parallelization achieved good scalability with respect to the number of CPU cores utilized and the problem size, the hybrid MPI/OpenMP-based parallelization significantly shortened the time for estimating the K function and performing simulations by utilizing computing resources on multiple computing nodes. The computational challenge imposed by point pattern analysis tasks on large sets of point events involving a large number of simulations can be addressed by utilizing elastic, distributed cloud resources.
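The first optimization can be sketched as below, under simplifying assumptions: points are sorted on x only, the y range is applied as a cheap filter, and no edge effect correction weights are applied, so `ripley_k_naive` is the uncorrected estimator K(r) = A · (number of ordered pairs within r) / (n(n-1)). All names (`Point`, `count_pairs_within`, `ripley_k_naive`) are hypothetical and not taken from the article.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { double x, y; } Point;

static int cmp_x(const void *a, const void *b)
{
    double ax = ((const Point *)a)->x;
    double bx = ((const Point *)b)->x;
    return (ax > bx) - (ax < bx);
}

/* Count unordered point pairs with distance <= r.
   Sorts pts in place by x so each point only scans forward while dx <= r. */
long count_pairs_within(Point *pts, int n, double r)
{
    assert(n >= 0 && r >= 0.0);
    long count = 0;
    int i;

    qsort(pts, (size_t)n, sizeof(Point), cmp_x);

    /* Per-point work varies with local density, hence dynamic scheduling. */
    #pragma omp parallel for reduction(+:count) schedule(dynamic, 64)
    for (i = 0; i < n; i++) {
        int j;
        for (j = i + 1; j < n && pts[j].x - pts[i].x <= r; j++) {
            double dx = pts[j].x - pts[i].x;
            double dy = pts[j].y - pts[i].y;
            if (dy > r || dy < -r)
                continue;  /* outside the search rectangle */
            if (dx * dx + dy * dy <= r * r)
                count++;
        }
    }
    return count;
}

/* Uncorrected K estimate for a study area of the given size. */
double ripley_k_naive(Point *pts, int n, double r, double area)
{
    assert(n > 1);
    long pairs = count_pairs_within(pts, n, r);
    return area * 2.0 * (double)pairs / ((double)n * (double)(n - 1));
}
```

Because the sort bounds the inner scan to points whose x coordinate lies within r, the cost drops from all n(n-1)/2 comparisons to roughly n times the number of points in each vertical strip, which is where the benefit for large point sets comes from.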
10.
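To address shortcomings in existing research on Delaunay triangulation construction methods, this paper proposes a parallel-computing-based Delaunay triangulation method for massive point clouds. Following the divide-and-conquer approach to Delaunay construction, triangulation is split into three steps: data partitioning, sub-network construction, and sub-network merging. An adaptive quadtree structure is designed to partition and map the data files, and the construction and merging operations are executed level by level following the Fork/Join parallel model of the OpenMP standard. Finally, an improved WFM-JLP scheduling algorithm is used to schedule the construction and merging operations to achieve better load balance. Experiments show that the method effectively reduces the algorithm's memory usage and computation time.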