137 results found; search took 15 ms
1.
This paper examines the efficiency of finite-discrete element method (FDEM) algorithms on massively parallel hardware and analyzes the time-consuming contact detection and contact interaction computations in the numerical solution. A detailed, practical GPU parallel procedure based on CUDA is designed for element nodal force calculation, contact detection, and contact interaction, including thread allocation and data access. The emphasis is on parallel optimization of the time-consuming contact detection with respect to load balance and GPU architecture. A CUDA FDEM parallel program was developed and evaluated for efficiency and fidelity on in situ stress, UCS, and BD simulation models, using an Intel i7-7700K CPU and an NVIDIA TITAN Z GPU; the overall speedup ratio after fracture exceeds 53 times. Compared with the CPU-based programs, CUDA FDEM parallel computing improves computational efficiency significantly at the same reliability, which makes larger-scale fracture simulations feasible.
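The abstract does not say which contact search algorithm the authors parallelize. As a minimal sketch of the general idea only, the following assumes a uniform-cell (binning) broad phase in which every element is approximated by a bounding circle of the same radius; the function name `candidate_contacts` and all parameters are hypothetical. On a GPU, each cell or element would typically be handled by its own thread, which is where load balance across cells becomes important.

```python
from collections import defaultdict


def candidate_contacts(centers, radius, cell):
    """Return sorted index pairs whose bounding circles overlap.

    Assumes cell >= 2 * radius, so two overlapping circles always lie
    in the same cell or in directly adjacent cells.
    """
    # Bin every element into the grid cell that contains its center.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(centers):
        grid[(int(x // cell), int(y // cell))].append(idx)

    limit_sq = (2.0 * radius) ** 2
    pairs = set()
    for (cx, cy), members in grid.items():
        # Only the 3 x 3 neighborhood of a cell can hold contact partners.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbors = grid.get((cx + dx, cy + dy), ())
                for i in members:
                    for j in neighbors:
                        if i < j:  # record each pair once
                            xi, yi = centers[i]
                            xj, yj = centers[j]
                            if (xi - xj) ** 2 + (yi - yj) ** 2 <= limit_sq:
                                pairs.add((i, j))
    return sorted(pairs)


# Hypothetical element centers: three close together, one far away.
centers = [(0.1, 0.1), (0.9, 0.2), (1.3, 0.1), (5.0, 5.0)]
contacts = candidate_contacts(centers, radius=0.5, cell=1.0)
```

Only the pairs that survive this broad phase would be passed on to the more expensive contact interaction step.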
2.
3.
Investigation of highly efficient algorithms for solving linear equations in the discontinuous deformation analysis method
Large-scale engineering computing using the discontinuous deformation analysis (DDA) method is time-consuming, which hinders the application of the DDA method. The simulation result of a typical numerical example indicates that the linear equation solver is a key factor affecting the efficiency of the DDA method. In this paper, highly efficient algorithms for solving linear equations are investigated, and two modifications of the DDA programme are presented. The first modification is a highly efficient linear equation solver. The block Jacobi (BJ) iterative method and the block conjugate gradient with Jacobi pre-processing (Jacobi-PCG) iterative method are introduced, and the key operations are detailed, including the matrix-vector product and the diagonal matrix inversion. The second modification is a parallel linear equation solver, constructed separately for multi-thread platforms with OpenMP and for CPU-GPU heterogeneous platforms with CUDA. The simulation results from several numerical examples using the modified DDA programme demonstrate that Jacobi-PCG is the better iterative method for large-scale engineering computing and that the adopted parallel strategies can greatly enhance computational efficiency. Copyright © 2015 John Wiley & Sons, Ltd.
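To illustrate the two key operations named in the abstract (the matrix-vector product and the diagonal inversion), here is a minimal serial sketch of a Jacobi-preconditioned conjugate gradient solver. It is a simplification and not the paper's implementation: it assumes a dense symmetric positive definite matrix and inverts scalar diagonal entries, whereas the block variant described above would invert diagonal sub-matrices. The names `matvec`, `dot`, and `jacobi_pcg` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradients with a diagonal (Jacobi) preconditioner.
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # diagonal inversion
    x = [0.0] * n
    r = list(b)                                   # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        Ap = matvec(A, p)                         # dominant cost per step
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small hypothetical test system.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = jacobi_pcg(A, b)
```

The matrix-vector product and the element-wise vector updates are the parts that would map naturally onto OpenMP threads or CUDA kernels.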
4.
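A fast buffer analysis algorithm for the 3D digital earth  Cited by: 1 (self-citations: 0, citations by others: 1)
An algorithm is proposed for fast buffer analysis of vector data on a graphics processing unit (GPU) in a 3D digital earth. A four-channel texture serves as the container that passes the vector data of geographic entities to the GPU. Exploiting the GPU's efficient parallelism, the distance between the vector coordinate of each pixel in the target buffer texture and the original entity is measured, so the buffer texture is obtained in a single rendering pass; the boundary of the buffer texture is then extracted. Using vector data for China's river basins and lakes, the proposed algorithm was tested and compared with two traditional CPU algorithms for buffer analysis. The results show that the proposed algorithm improves efficiency by a factor of 9 to 16 over the traditional vector algorithm and by a factor of 11 to 20 over the traditional raster algorithm. The experiments show that the algorithm is computationally simple and effective; in particular, as the data volume grows, its buffer computation is markedly faster than the traditional algorithms, and it effectively resolves the data self-intersection problem of traditional vector-based buffer analysis.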
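The per-pixel distance test described above can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the entity is a single polyline, the grid is a plain 2D raster rather than a GPU texture, and the function names and grid parameters are hypothetical. In the GPU version each pixel would be evaluated independently in one rendering pass, which is what the nested loops stand in for.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to segment (a, b)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection, clamped to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def buffer_mask(polyline, radius, x0, y0, cell, width, height):
    """Mark every pixel whose center lies within `radius` of the polyline."""
    segments = list(zip(polyline[:-1], polyline[1:]))
    mask = []
    for row in range(height):
        cy = y0 + (row + 0.5) * cell
        line = []
        for col in range(width):
            cx = x0 + (col + 0.5) * cell
            dist = min(point_segment_distance(cx, cy, ax, ay, bx, by)
                       for (ax, ay), (bx, by) in segments)
            line.append(dist <= radius)
        mask.append(line)
    return mask


def boundary_pixels(mask):
    """Buffer pixels with at least one 4-neighbor outside the buffer."""
    height, width = len(mask), len(mask[0])
    edge = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < height and 0 <= cc < width)
                if outside or not mask[rr][cc]:
                    edge.append((r, c))
                    break
    return edge


# Hypothetical example: a horizontal line buffered on a 10 x 10 grid.
polyline = [(2.0, 5.0), (8.0, 5.0)]
mask = buffer_mask(polyline, radius=1.2, x0=0.0, y0=0.0,
                   cell=1.0, width=10, height=10)
edge = boundary_pixels(mask)
```

Because overlapping parts of the entity simply mark the same pixels, a raster formulation like this has no self-intersection cases to resolve.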
5.
6.
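Gong Shijun (龚湜均). 《测绘与空间地理信息》 (Geomatics & Spatial Information Technology), 2015, (11)
As society progresses, people pay increasing attention to how efficiently and effectively the relevant departments handle emergencies. Faced with unpredictable incidents and complex task requirements, joint operation across multiple departments becomes the key to resolving emergencies. Using WebGL to call the computer's graphics processing unit (GPU) directly for fast rendering of 3D symbols makes it possible to unify the symbols of each department's command system and to use them across platforms, which effectively improves the efficiency and effectiveness of emergency handling.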
7.
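Multi-attribute fusion is a technique developed in recent years to address the limitations of single attributes. Based on a study of real-time fusion of multiple seismic attribute volumes in 3D space, an implementation scheme is proposed. An octree structure manages the data dynamically and efficiently, and Shader programming with acceleration by the GPU programmable pipeline provides 3D visualization of the fused multi-attribute volumes. Based on this scheme, a 3D visualization system for multi-attribute volume fusion was developed, implementing two fusion techniques, one based on RGB mapping and one on attribute weighting. The system ensures high-quality real-time fused rendering of multiple attribute volumes and has given good results in practical application.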
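The two fusion techniques named above can be sketched per voxel as follows. The abstract does not give the formulas, so this assumes the simplest common forms: min-max normalization of each attribute, one attribute per color channel for RGB mapping, and a normalized weighted sum for attribute weighting. All function names and sample values are hypothetical; in a shader the same arithmetic would run once per fragment.

```python
def normalize(values):
    """Rescale a list of attribute values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0] * len(values)
    return [(v - lo) / span for v in values]


def rgb_fusion(attr_r, attr_g, attr_b):
    """Map three attributes to the red, green, and blue channels."""
    channels = zip(normalize(attr_r), normalize(attr_g), normalize(attr_b))
    return [(round(255 * r), round(255 * g), round(255 * b))
            for r, g, b in channels]


def weighted_fusion(attributes, weights):
    """Blend any number of attributes into one value per voxel."""
    total = sum(weights)
    scaled = [normalize(attr) for attr in attributes]
    count = len(scaled[0])
    return [sum(w * attr[i] for w, attr in zip(weights, scaled)) / total
            for i in range(count)]


# Hypothetical attribute samples for three voxels.
amplitude = [0.0, 5.0, 10.0]
frequency = [10.0, 20.0, 30.0]
coherence = [1.0, 0.5, 0.0]

colors = rgb_fusion(amplitude, frequency, coherence)
blended = weighted_fusion([amplitude, frequency, coherence], [2.0, 1.0, 1.0])
```

RGB mapping is limited to three attributes at a time, while the weighted form accepts any number and lets the interpreter emphasize one attribute over the others.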
8.
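Research on a GPU-based acceleration algorithm for terrain visualization  Cited by: 1 (self-citations: 0, citations by others: 1)
Terrain visualization uses a digital elevation model (DEM) together with computer graphics and image processing techniques to simulate and display 3D terrain. The technique is widely applied in deep mineral prediction, mineral resource evaluation, virtual reality, entertainment and games, flight simulation, and many other fields. As data volumes grow, the real-time, smooth visual quality of 3D terrain visualization is limited by the current state of computer hardware. To address this problem, this paper uses the ROAM algorithm for terrain modeling and exploits the high-speed parallel computing capability of the GPU to accelerate terrain model construction and display. Comparative experiments show that the speedup is not significant when the computational load is small, becomes increasingly evident as the load grows, and levels off at a stable value once the load reaches a certain size. The results provide original visualization technology support for terrain visualization, mineral resource evaluation, and similar work.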
9.
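The hybrid pseudospectral and high-order finite-difference method uses a staggered-grid finite-difference operator in the vertical direction, exploiting its high degree of parallelism, and a pseudospectral operator in the horizontal direction, retaining its high accuracy; it is an effective method for computing seismic wavefields. The graphics processing unit (GPU), being highly parallel, has a significant advantage for this type of problem. The Compute Unified Device Architecture (CUDA) platform introduced by NVIDIA has greatly reduced the difficulty of GPU programming. To improve computational efficiency, this paper implements 2D seismic wavefield simulation with the hybrid method on the CUDA platform, and then compares the run times of the CPU and GPU versions on a 2D homogeneous medium model. The tests show that the CUDA-based parallel simulation significantly increases computation speed while maintaining accuracy, offering an option for large-scale numerical simulation of seismic wave propagation in heterogeneous earth media.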
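The two derivative operators that the hybrid method combines can be sketched in one dimension each. This is a minimal serial illustration, not the paper's CUDA code: the pseudospectral derivative uses a naive O(n²) discrete Fourier transform where a real implementation would use an FFT, and the staggered-grid operator is assumed to be fourth order (coefficients 9/8 and -1/24), since the abstract does not state the order. The function names are hypothetical.

```python
import cmath
import math


def spectral_derivative(f, dx):
    """Derivative of periodic samples via the Fourier transform."""
    n = len(f)
    spectrum = [sum(f[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n)) for k in range(n)]
    scaled = []
    for k in range(n):
        wave_index = k if k < n / 2 else k - n
        if n % 2 == 0 and k == n // 2:
            wave_index = 0  # drop the Nyquist mode for a real derivative
        wavenumber = 2 * math.pi * wave_index / (n * dx)
        scaled.append(1j * wavenumber * spectrum[k])
    return [(sum(scaled[k] * cmath.exp(2j * math.pi * k * m / n)
                 for k in range(n)) / n).real for m in range(n)]


def staggered_derivative(f, dz):
    """Fourth-order staggered-grid derivative at the half points i + 1/2.

    Returns values for i = 1 .. len(f) - 3 (interior points only).
    """
    c1, c2 = 9.0 / 8.0, -1.0 / 24.0
    return [(c1 * (f[i + 1] - f[i]) + c2 * (f[i + 2] - f[i - 1])) / dz
            for i in range(1, len(f) - 2)]


# Horizontal direction: one period of sin(x) on 16 periodic samples.
n = 16
dx = 2 * math.pi / n
horizontal = spectral_derivative([math.sin(m * dx) for m in range(n)], dx)

# Vertical direction: sin(z) on a non-periodic grid with spacing dz.
dz = 0.1
vertical = staggered_derivative([math.sin(i * dz) for i in range(20)], dz)
```

The finite-difference stencil touches only a few neighboring samples, which is why the vertical direction parallelizes easily, while the spectral operator couples every sample along a horizontal line in exchange for higher accuracy.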
10.
We have successfully ported an arbitrary high-order discontinuous Galerkin method for solving the three-dimensional isotropic elastic wave equation on unstructured tetrahedral meshes to multiple Graphics Processing Units (GPUs) using NVIDIA's Compute Unified Device Architecture (CUDA) and the Message Passing Interface (MPI). We obtained a speedup factor of about 28.3 for the single-precision version of our codes and about 14.9 for the double-precision version. The GPU used in the comparisons is the NVIDIA Tesla C2070 Fermi, and the CPU is the Intel Xeon W5660. To overlap inter-process communication with computation effectively, we separate the elements on each subdomain into inner and outer elements, complete the computation on the outer elements, and fill the MPI buffer first. While the MPI messages travel across the network, the GPU performs the computation on the inner elements and all other calculations that do not use information about outer elements from neighboring subdomains. A significant portion of the speedup also comes from a customized matrix-matrix multiplication kernel, which is used extensively throughout our program. Preliminary performance analysis of our parallel GPU codes shows favorable strong and weak scalability.
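The inner/outer separation that enables the overlap can be sketched as a simple classification over the mesh adjacency. This is an illustration under assumptions, not the authors' code: the mesh is given as a hypothetical adjacency dictionary, `owner` maps each element to the rank of its subdomain, and an element counts as outer when at least one neighbor belongs to another subdomain.

```python
def split_inner_outer(neighbors, owner, rank):
    """Split the elements owned by `rank` into inner and outer lists.

    neighbors: dict mapping element id -> list of adjacent element ids
    owner:     dict mapping element id -> rank of the owning subdomain
    """
    inner, outer = [], []
    for elem in sorted(neighbors):
        if owner[elem] != rank:
            continue
        if any(owner[adj] != rank for adj in neighbors[elem]):
            outer.append(elem)  # its result must be sent to a neighbor
        else:
            inner.append(elem)  # needs no data from other subdomains
    return inner, outer


# Hypothetical mesh: a chain of six elements split between two ranks.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
owner = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

inner0, outer0 = split_inner_outer(neighbors, owner, rank=0)
inner1, outer1 = split_inner_outer(neighbors, owner, rank=1)

# Order of work per time step, following the description above:
#   1. compute the outer elements and fill the MPI send buffer
#   2. start the non-blocking exchange with neighboring subdomains
#   3. compute the inner elements while the messages are in flight
#   4. wait for the exchange, then finish the steps that need neighbor data
```

The overlap pays off when a subdomain has many more inner than outer elements, so that step 3 takes at least as long as the message transfer.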