We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method in local-to-regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer for absorbing boundary condition. A hybrid-style programming using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and great portability to allowing excellent performance from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC) are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.

Graphical abstract

A schematic illustration of the flow of the seismic wave propagation simulation using OpenSWPC.

Of the wide variety of numerical methods used to simulate seismic wave propagation, the staggered-grid finite difference method (FDM) has been widely used in the seismological community. The staggered-grid FDM with second-order accuracy in space and time to simulate seismic waves was first proposed by Madariaga (1976); then, it was expanded to fourth-order accuracy in space by Levander (1988). The 3D FDM simulation of seismic wave propagation at regional scales has been conducted after Olsen et al. (1995) and Graves (1996). Since then, the FDM simulations have been widely used for the numerical modeling of seismic wave propagation (see Moczo et al. 2014 and references therein) due to the relative simplicity of its numerical algorithm as well as its computational efficiency in parallel computing. Multiple available FDM codes, such as GMS (Aoi and Fujiwara 1999; Aoi et al. 2004), FDMPI (Bohlen 2002), AWP-ODC (Cui et al. 2013), FDSim (Moczo et al. 2014) and SW4 (Petersson and Sjogreen 2014), have been used to numerically model seismic wave propagation. Many other available community codes including other numerical methods such as the spectral-element method are described in Igel (2016).

In this study, we developed a new staggered-grid FDM code to model seismic waves in 3D and 2D viscoelastic media in local-to-regional scale. By improving the usability of the code, called the Open-source Seismic Wave Propagation Code (OpenSWPC), the numerical simulation with parallel computing of a wide variety of targets is made accessible to users. The original version of this code, designed to be implemented on supercomputers (Furumura and Chen 2005; Inoue et al. 2013), was fully restructured to have excellent performance on a variety of computer architectures to be fully open to the seismological community.

The development of OpenSWPC was in particular intended to improve its usability for non-experts who are not so familiar with numerical simulation techniques. For this purpose, OpenSWPC combines the necessary pre- and post-processing functions in its main simulation code (Fig. 1), and all the necessity information for the simulations are defined in input text parameter files. Therefore, users do not need to modify the code to apply it to individual simulation targets. We adopt de facto standard formats for the binary input and output datasets; the velocity structure input model is given in the Network Common Data Form (NetCDF; Rew and Davis 1990), and the simulation results are output in the NetCDF and the Seismic Analysis Code (SAC; Goldstein et al. 2003; Helffrich et al. 2013) formats. Following the input parameters, OpenSWPC dynamically allocates the computer memory, automatically generates grids for a heterogeneous velocity model and performs parallel numerical simulations of seismic wave propagation.

Fig. 1

A schematic illustration of the flow of the wave propagation simulation using OpenSWPC

In the following, we briefly review the strategy of the FDM simulation of seismic wave propagation in heterogeneous viscoelastic media with frequency-independent attenuation and the numerical techniques adopted in the present code. Then, examples of numerical simulations are demonstrated to show how the present code works.

In this section, we briefly introduce the staggered-grid FDM method for simulating seismic waves in viscoelastic media and the algorithms adopted in OpenSWPC. The details of theories for viscoelasticity and numerical algorithms can be found in Moczo et al. (2014) and Igel (2016).

Seismic wave modeling with a viscoelastic body

We solve the equation of motion of continuum mechanics described by the velocity and stress components in the Cartesian coordinate system such that

where \(N_{{\rm D}}\) is the model dimension (\(N_{{\rm D}} =2\) or 3), \(v_i\) is the particle velocity of the elastic motion in the ith component, \(\rho\) is the density, \(\sigma _{ij}\) is the i, jth component of the stress tensor and \(f_i\) is the ith component of the body force. In the following, we primarily describe the three-dimensional case, \(N_{{\rm D}}=3\); however, it is straightforward to reproduce the two-dimensional P-SV or SH formulae from this case.

The stress is related to the particle velocity via the constitutive equation. In this study, we adopted the Generalized Zener Body (GZB) (e.g., JafarGandomi and Takenaka 2007; Maeda et al. 2013) of the viscoelasticity, which is expressed as a parallel connection of Zener bodies (e.g., Aki and Richards 2002, Chapter 5) having different physical constants. Note that the GZB is known to be equivalent to the generalized Maxwell body (Moczo and Kristek 2005; Cao and Yin 2014), which is also widely used for seismic wave modeling (e.g., Emmerich and Korn 1987).

Following Robertsson et al. (1994), the constitutive equation of the viscoelastic body is written as

where \(\epsilon _{ij}\) is the i, jth component of the strain tensor, \(\dot{\psi }_\pi \left( t \right)\) and \(\dot{\psi }_\mu \left( t \right)\) are the time derivatives of the relaxation functions for two independent relaxed moduli \(\pi _{R} \equiv \lambda _{R} + \mu _{R}\) and \(\mu _{R}\), respectively. For the GZB, they are represented using the L different relaxation times \(\tau _\ell ^\sigma\) (\(\ell =1,\ldots ,L\)) and the creep times of the P- and S-waves (\(\tau _\ell ^{\epsilon P}, \tau _\ell ^{\epsilon S},\) respectively) such that

As a consequence of introducing the viscoelasticity, the phase speeds of the body waves become a function of the frequency due to the physical dispersion effect (Aki and Richards 2002). Therefore, they follow the given velocity structure at a given reference frequency \(f_{R}\).

In the actual Earth medium, \(Q_{P}\) and \(Q_{S}\) do not significantly depend on frequency over a wide frequency range below \(\sim\)1 Hz. To yield an approximately constant Q over a frequency range, we introduced memory variables following Robertsson et al. (1994) with the \(\tau\)-method proposed by Blanch et al. (1995). The \(\tau\)-method assumes that the ratios between the relaxation and creep times are constant among all Zener bodies (\(\ell =1,\ldots ,L\)):

From a given set of relaxation times, the \(\tau\)-method gives optimized values for the parameters \(\tau ^P\) and \(\tau ^S\) using the least squares method, so that the attenuation \(Q_{P}\) and \(Q_{S}\) become approximately constant over a given frequency range. The constitutive equation based on this \(\tau\)-method is written as follows:

The selection of the relaxation times, \(\tau _\ell ^\sigma\), is not a trivial issue in the \(\tau\)-method. We adopted logarithmically spaced relaxation times in the given frequency range. As in an example in Fig. 2, the adopted model can yield an approximately constant Q over two orders of the frequency range below 2 Hz using three Zener bodies (\(L=3\)). Frequency range of the approximately constant Q is adjustable by input parameter of the code.

Fig. 2

An example of a nearly constant Q model using three Zener bodies (bold black curve), considering a constant Q (=100) in the frequency range between 0.02 and 2 Hz. Dotted lines show the contributions from three Zener bodies

Discretization

The equation of motion, Eq. (1), and the constitutive equations, Eqs. (5) and (6), are solved numerically based on a staggered-grid FDM with fourth-order accuracy in space and second-order accuracy in time (e.g., Levander 1988). We adopt the Cartesian coordinate system with \(x_1=x\) and \(x_2=y\) as the horizontal direction and vertical axis \(x_3=z\) was taken to be positive downward with average sea level height at \(z=0\). Figure 3 shows the layout of the staggered-grid adopted in this study.

Fig. 3

Layout of the three-dimensional Cartesian coordinate staggered-grid adopted in this study

We adopted the first-order Crank–Nicolson method to calculate the time integration of the constitutive equation, Eq. (5), and the auxiliary equation for the memory variables, Eq. (6). This enables us to solve the original implicit scheme equations explicitly. The discretized formula can be found in Maeda et al. (2013).

The evaluation of the medium properties defined in the staggered-grid system requires appropriate averaging of the medium parameters defined on the neighboring grids. In this code, all medium properties (relaxed medium parameters \(\lambda _{R}\) and \(\mu _{R}\), density \(\rho\), and attenuation parameters \(\tau ^P\) and \(\tau ^S\)) are defined on the same grid as the normal stress components (Fig. 3). Averaging of the density on neighboring grids is necessary when evaluating the velocity, Eq. (1), and averaging the relaxed rigidity modulus is necessary when evaluating the shear stress components and the accompanying memory variables (Eqs. 5, 6). For the averaging, we adopted an arithmetic and a harmonic averaging for the density and the relaxed rigidity modulus, respectively. Notice that no averaging is necessary for \(\lambda _{R}\) because it is used only for updating the normal stress components.

Boundary conditions

The free surface and ocean bottom boundary conditions are implemented in the FDM simulation based on the efficient Heterogeneity, Oceanic layer and Topography (HOT) FDM method (Nakamura et al. 2012). This method was originally developed for a second-order FDM method to include the topographic variations of volcanoes (Ohminato and Chouet 1997). Later, it was shown that this condition could be applied effectively for fluid–solid boundaries (Okamoto and Takenaka 2005). Currently, HOT-FDM is widely applied in regional-scale modeling of seismic wave propagation in coastal areas (e.g., Maeda and Furumura 2013; Maeda et al. 2013, 2014; Nakamura et al. 2012, 2015; Noguchi et al. 2016; Todoriki et al. 2017), in high-frequency seismic wave simulations of topographic scattering (Takemura et al. 2015), and in the synthesis of elastic wave propagation in cylinder-shaped materials on the scale of 10 cm (Yoshimitsu et al. 2016), demonstrating the ability of the HOT-FDM to include irregular topography and curved surfaces in FDM.

In this HOT-FDM method, the air column is treated as a medium with a very small density and zero wavespeeds for the P- and S-waves as \(\alpha _{R} = \beta _{R} = 0\) [km/s]. This column is treated as a vacuum where no seismic wave can propagate due to zero wavespeeds. The ocean column is treated as an elastic medium having \(\rho = 1.0\) [\({\text{g/cm}}^3\)], \(\alpha _{R} = 1.5\) [km/s] and \(\beta _{R} = 0.0\) [km/s]. On the free surface and seafloor, a reduced, second-order FDM scheme is applied instead of the fourth-order FDM. To apply such a reduced-order FDM across the boundaries, the location of the boundaries are detected automatically in OpenSWPC by searching for the grid positions where \(\mu _{R}\) and \(\lambda _{R}\) change from zero to a finite value.

Except in whole-Earth seismic wave modeling, an appropriate absorbing boundary condition surrounding the bounded model is necessary to avoid artificial reflections from the boundaries of the computational model. We adopt the Perfectly Matched Layer (PML) boundary condition (after Chew and Liu 2011) to minimize artificial reflections. Up to now, various PML methods have been developed (for example, Kristek et al. 2009 and references therein); we adopted an implementation proposed by Zhang and Shen (2010) to consider its effectiveness in and applicability to OpenSWPC.

This PML method solves the auxiliary differential equations in addition to the equations of motion and the constitutive equations in the absorbing zone with complex and frequency-shifted absorbing functions. Typically 10–20 grids are used for the thickness of the PML zone surrounding the model; this can be defined in the input parameters. To avoid large computational loads when solving the auxiliary equations of the PML, we assumed that the material in the PML zone was perfectly elastic and did not calculate the memory variables (Eq. 6), for the anelasticity.

Even though the PML efficiently absorbs outgoing waves from the model, it occasionally causes severe instabilities during the calculation, in particular when the seismic waves propagate in highly heterogeneous structures with very large velocity contrasts (Maeda et al. 2013). Therefore, we also implemented the stable sponge boundary condition of Cerjan et al. (1985), which simply attenuates the waves in the absorbing layer by multiplying small values to the stress and velocity components in the absorbing zone at each time step so that users can select the appropriate boundary condition. Note that the sponge condition is perfectly stable but less efficient when absorbing outgoing waves.

Seismic source representation

The seismic moment tensor source can be implemented in the FDM either by couples of body forces (Graves 1996) or based on a stress discontinuity representation (Coutant et al. 1995; Pitarka 1999). In OpenSWPC, the stress discontinuity representation was adopted to implement the moment tensor sources. The seismic source can be implemented in OpenSWPC with a single point source. A finite-fault source can be represented using multiple point sources (Graves and Wald 2001; Takenaka and Fujii 2008). In this code, there is no restriction on the number of point sources; the code automatically detects the number of sources and allocates the required memory. We also implemented a single-force source that is widely used for modeling seismic signals excited in volcanic environments (e.g., Ohminato et al. 1998). Both moment tensor and single-force sources are placed on the nearest grid point of the normal stress component. This source grid point is averaged with the neighboring source grid points if necessary when updating the shear stress and velocity components (Coutant et al. 1995).

The seismic source time function is represented by bell-like functions (Fig. 4) of various forms such as boxcar, triangle and cosine functions, Herrmann’s quadratic function (Herrmann 1979) and Küpper’s wavelet of a single cycle (Mavroeidis and Papageorgiou 2003). They have a common cutoff frequency, which is a reciprocal of the source duration time, but have different roll-offs above the cutoff frequency.

Fig. 4

Implemented source time functions and their amplitude spectra: a boxcar function, b triangle function, c Herrmann function, d cosine function, e Küpper’s wavelet and f\(t \exp [-t]\) function. In each panel, the left figure shows the temporal variation of the source time function and the right figure shows the amplitude spectrum. The horizontal axes are normalized by the rise time \(T_{R}\)

The incidence of plane P- or S-waves with near-vertical angles was also implemented in OpenSWPC. The plane-wave incidence uses a solution of the up-going wave of the 1D wave equation in a homogeneous medium as an initial condition of the velocity and stress components at the bottom of the model (e.g., Emoto et al. 2010). A bell-like-shaped spatial distribution corresponding to the source time function (Fig. 4) is assumed as initial conditions. Even though the OpenSWPC allows a non-vertical incidence angle to be set for the plane wave, note that artificial reflections may arise at both sides of the plane wave traveling in the absorbing boundaries with increasing plane-wave incidence angles.

Simulations exchanging source and station positions are sometimes conducted because the reciprocity theorem (e.g., Aki and Richards 2002) implies the equivalence of both simulation results under certain conditions. The use of the reciprocity theorem is very effective to obtain a large number of synthetic seismograms for a small number of stations when there is a very large number of source grids, e.g., when calculating Green’s function with reduced computational cost (e.g., Graves and Wald 2001; Eisner and Clayton 2001; Zhao et al. 2006; Hsieh et al. 2016; Petukhin et al. 2016). OpenSWPC is equipped with this feature. In this mode, a single-force source is placed at a specified station grid to export displacement waveforms and spatial derivatives of displacement at the multiple source grid locations. Using a source having much shorter source time functions compared to the dominant period of the simulation target, the simulation results can be used as Green’s functions.

Parallel computing

All codes in OpenSWPC were written following to the standards of Fortran 2003 (e.g., Adams et al. 2008) with very few compiler-unique functions; therefore, this code should work in a variety of computational environments without modification. This code uses the external libraries of Message Passing Interface (MPI) and NetCDF for parallel computing and data I/O, respectively. Note that these external libraries also work in many environments.

In OpenSWPC, all definitions such as parallelization strategies, model size, discretization and output filenames are defined in the input parameter files. Because the computational memory is allocated dynamically, users do not need to re-compile the code to perform simulations with different model sizes. OpenSWPC adopts a domain-partitioning procedure for the parallel computing using MPI. It performs 2D and 1D model partitioning in the horizontal directions for the 3D and 2D codes, respectively (see Fig. 5 for the partitioning for the 3D code). Then, OpenSWPC sets up the velocity structure model for the partitioned domain on each CPU or CPU core, and calculates the seismic wave propagation in the domain using MPI data communication at each time step.

Fig. 5

a A schematic illustration of the domain partition model for MPI data communication in a 3D simulation. b A schematic illustration of the data communication strategy using MPI.

(Reproduced from figure B1 of Maeda et al. (2013) with modification. Copyright by the Seismological Society of America)

To maintain wavefield continuities across the partitioned domain, the velocity and stress components defined at the two-point-thick outermost layer are exchanged with those of the neighboring partitioned domains at every time step using MPI (Fig. 5b). We adopted hybrid-style programming using OpenMP and MPI (e.g., Furumura and Chen 2005) to make full use of many CPUs with multiple CPU cores; directive-based OpenMP parallel programming is applied to parallelize the computation among the CPU cores, and the MPI data communication between the CPUs is performed explicitly. Compared to the flat-MPI application which explicitly exchanges data for all cores, this hierarchical OpenMP/MPI parallel programming method is expected to reduce the data communication overhead (Furumura and Chen 2005).

The simulations are performed with a mix of single- and double-floating-point-precision arithmetic. Static parameters such as the medium velocities or density are defined in single precision; however, the FDM calculation and the stress discontinuity at the sources are evaluated more accurately in double precision. We found that simulations in single precision often lead to numerical instability after long-time step calculations, as demonstrated in Fig. 6. The panels in Fig. 6a show snapshots of the particle velocity amplitude in the seismic wavefield obtained by a 2D P-SV numerical simulation using single-precision arithmetic. After a long elapsed time, it was confirmed that a randomly oscillating noise was continuously being radiated from the source location. Such unstable noise is suspected to occur due to the wide dynamic range of the wavefield amplitude near the seismic source, which cannot be handled accurately using single-precision arithmetic. The accumulation of this small error over long time steps may cause problems or instabilities in the simulation. The use of double-precision arithmetic can effectively avoid this problem (Fig. 6b); however, it leads to a considerable increase in the memory usage and computation time. OpenSWPC has an option to evaluate all calculations using single-precision arithmetic to reduce the computational cost.

Fig. 6

A comparison of simulations based on a single-precision arithmetic calculations and b mixed-precision calculations. In each frame, the absolute amplitude of the horizontal components of the velocity at the elapsed times of 50 and 500 s are shown in the color scales. Circles show the location of the point source, which is an isotropic radiation of seismic waves

Model input

For the 3D simulation of seismic wave propagation, we consider a layered structure model formed by a set of velocity layers with varying depths. Each layer describes the medium discontinuities with the definitions of medium parameters of density, P- and S-wave velocities, and attenuation (\(Q_{P}, Q_{S}\)) beneath the layer. The depth of the shallow-most layer corresponds to the topography. The topography is treated as a staircase boundary with a sufficiently small FDM grid. We assume that \(z=0\) on the depth axis corresponds to the average sea level; the topography deeper than zero is treated as the seafloor, and the seawater column is filled between the seafloor and \(z=0\).

We adopted NetCDF as the input data format for each layer. The NetCDF format is commonly used in the seismological community, such as in Generic Mapping Tools (GMT; Wessel et al. 2013). In OpenSWPC, a set of NetCDF files and a list of these files associated with the medium parameters are used as input to OpenSWPC. Each NetCDF file is defined in geographical coordinates (longitude and latitude) and contains the depth information at each coordinate location.

Even though the input NetCDF files describe the depths of the boundary at the locations defined by the geographical coordinates, the FDM simulation model needs the depths of the boundary at grids in the Cartesian coordinate system. Therefore, a coordinate transform is necessary to model seismic waves in the Cartesian coordinate system. For this purpose, a coordinate transform based on a power-series expression of the Gauss–Krüger transform (Kawase 2011) is embedded in OpenSWPC. Users are requested to input the center of the simulation model in longitude and latitude for the coordinate transform, and the OpenSWPC automatically generates the velocity structure model in the Cartesian coordinate system. Because the grid locations in a Cartesian coordinate system are usually not identical to the grid locations of the input NetCDF files, a bicubic interpolation is applied to obtain the depth of the boundary.

The layered velocity structure model can be superimposed by a random velocity fluctuation described by statistical characteristics such as the Gaussian or von Kármán power spectrum density functions (Sato et al. 2012), to simulate the scattering of high-frequency seismic waves in heterogeneous structures (e.g., Furumura and Kennett 2005; Takemura et al. 2015, 2016). The random velocity perturbation is calculated in wavenumber space based on the statistical characteristics of the random medium (Klimeš 2002; Sato et al. 2012). Utility programs to calculate these random velocity fluctuations in 2D and 3D are provided in OpenSWPC. The utility programs provide the random velocity fluctuation data in the NetCDF format, which can be used in OpenSWPC. By customizing the model generation subroutine in OpenSWPC, it is easy to implement another type of velocity model in OpenSWPC if necessary.

Simulation data output

The simulation results are exported as two types of datasets: waveforms at specified station points and 2D snapshots of the seismic wavefield on 2D horizontal and vertical cross sections. For the waveforms, velocity and displacement traces from specified station locations (in either geographical or Cartesian coordinates) are exported in the SAC data format. Because the FDM simulations are performed with much smaller time steps compared to the dominant period of the seismic waves, a temporal decimation is often applied to reduce the data size. The decimation factor is specified in the input parameter file. Note that the integration of velocity waveforms with respect to time to obtain displacement records is performed before the decimation.

The snapshots of the wavefield are stored in NetCDF format files. These cross sections can be taken vertically or horizontally, or along the topography or bathymetry. The snapshots contain the three-component velocity or the displacement motions, or divergence and rotation of the velocity, which are related to the P- and S-waves.

Checkpoint and restart

Most computer systems restrict the maximum computer time for a single simulation job via their job queuing systems. To cope with this restriction, OpenSWPC is equipped with a checkpoint–restart function that halts the simulation at a specified checkpoint prior to the allowed CPU time and exports all data on memory to files. Then, a new job restarts the simulation from the checkpoint by reading the stored data files of the previous job.

Two-dimensional codes

OpenSWPC also provides 2D codes for the P-SV and SH models. The 2D FDM simulations use the same input parameter file, except they perform the wave propagation simulations in the x–z plane. A 2D simulation along a cross section of the 3D medium has a much lighter computational load. The horizontal direction of the cross section can be chosen arbitrarily by adjusting the coordinate rotation parameter. Note that the 2D codes adopt one-dimensional MPI domain partitioning along the x direction.

Strong-scale performance measurement

The computational efficiency of the parallel simulation was examined using a strong scaling test, where the computation time of a fixed-sized numerical model is measured for parallel computing using different CPU numbers. We performed this test on different computer systems and two different model sizes, a cluster-type computer of the Earthquake Information Center (EIC) system of the Earthquake Research Institute at the University of Tokyo with up to 36 Intel Xeon E5-2680 v3 (403.2 GFlops) CPUs, and the Earth Simulator (ES) supercomputer of the Japan Agency for Marine–Earth Science and Technology (JAMSTEC) with up to 2048 the NEC SX-ACE (256 GFlops) CPUs.

The parallel performance was measured for two model sizes: \(512 \times 512 \times 512\) and \(1024 \times 1024 \times 1024\) on the EIC and \(1024 \times 1024 \times 1024\) and \(2048 \times 2048 \times 2048\) on the ES. In these experiments, the average computation time per time step was normalized by the total number of grids. The results (Fig. 7) demonstrate a linear decrease in the computation time (a linear increase in the computational performance) with increasing numbers of CPUs for all models. This parallel performance follows a near perfect theoretical scaling, except for cases of over parallelization using very large numbers of CPUs for the small model simulations (e.g., the result of the \(1024 \times 1024 \times 1024\) model on the ES). This result supports the efficiency of the parallel simulation of OpenSWPC using a variety of numbers of CPUs from a small (2) to large (2048) number of CPUs.

Fig. 7

Speedup of the computational time of the 3D simulation code with increasing numbers of CPUs for different model sizes and computer systems. The computation time of each model is normalized by the FDM grid numbers (see text for details)

Figure 8 compares the computational speeds by adopting OpenMP/MPI hybrid computation (hybrid-MPI) relative to pure MPI (flat-MPI) computation. The measurement was taken for the model size of \(1024 \times 1024 \times 1024\) on the EIC. A CPU of the EIC has 12 CPU cores. For the flat-MPI parallel computing, the velocity structure model was partitioned and assigned into all cores, and explicit data communications between CPU cores have been made via MPI. On the other hand, the hybrid-MPI computing utilizes the shared memory among CPU cores in the CPU, and the MPI data communications were performed only for inter-CPU data exchange. The comparison shows considerable improvement in computational performance for the hybrid-MPI, up to 25% speedup compared to the flat-MPI. The speedup ratio slightly increases with increasing numbers of CPUs. These performance improvements in the hybrid programming are in particular because of the efficient reuse of the data on the cache of CPU which are shared among CPU cores (Inoue et al. 2013). We note that the efficiency of the hybrid-MPI may depend on model configurations such as model size or number of CPUs and computer architecture.

Fig. 8

Comparison of the normalized computation times between flat-MPI and OpenMP/MPI hybrid parallel computing. The experiments were conducted on the EIC computer (12 Cores/CPU) for the case of \(1024^3\) grids model with different number of CPUs

Memory requirements and maximum frequency of the 3D FDM

The maximum size of the 3D FDM simulation and the highest frequency of the seismic waves in the model are bounded by the size of the computer memory. To consider the maximum frequency of the simulation, suppose a simulation model has a square horizontal surface with an area of S and a depth of D. OpenSWPC requires a memory of \(m_{{\rm g}} = 188\) bytes per grid when using mixed-precision arithmetic calculations. Discretizing the 3D computational volume with a uniform grid spacing of h indicates that OpenSWPC requires a total memory size, \(M_{R}\), of

$$M_{R}= \frac{m_{g} S D}{h^3}.$$

(7)

Further, the spatial grid size h controls the highest frequency allowed for the FDM simulation considering the minimum wavespeed, \(v_{{\rm min}}\), such that

$$f_{{\rm max}}= \frac{v_{{\rm min}}}{7\,h},$$

(8)

where we assume that at least seven grid points per minimum wavelength are necessary to restrict the numerical dispersion of the FDM calculation within the required level. Combining the above equations (Eqs. 7, 8), the maximum frequency of the 3D FDM, fmax, can be estimated by the size of the simulation, the minimum wavespeed and the memory size of the computer Mmax such that

We evaluated the maximum frequency for the 3D FDM simulation on the EIC and the ES as a function of horizontal distance in the model and minimum wavespeed (Fig. 9). In this evaluation, we assumed a square simulation model in the horizontal directions with a size of \(\sqrt{S}\) and a fixed model depth of \(D=200\) km as a typical simulation model on a regional scale. For example, assuming a minimum S-wavespeed of 1.5 km/s, which corresponds to the P-wavespeed in the ocean column, and a horizontal scale of 200 km, OpenSWPC can simulate a high-frequency seismic wavefield up to 2 and 10 Hz using the EIC and the ES, respectively.

Fig. 9

Maximum frequency of the 3D FDM simulation bounded by the maximum memory size for a the EIC computer (2.2 TB) and b the Earth Simulator (128 TB) as a function of the horizontal size of the 3D model and the minimum value of the wavespeed in the medium. The color scale is shown in the right

To demonstrate the effectiveness of OpenSWPC, here we show examples of the seismic wave propagation simulations performed by OpenSWPC.

Comparison with the wavenumber integration method

We first demonstrate the accuracy of the simulation for a laterally homogeneous medium compared to the result of the wavenumber integration method using the FKRPOG code (Saikia 1994), which is distributed along with the TDMT_INVC moment tensor estimation code (Dreger 2003). The wavenumber integration method is often employed as a reference to validate the accuracy of synthetic seismograms in laterally homogeneous layered structures. In this experiment, we assumed the layered structure model of Kubo et al. (2002) and a double-couple point source (strike: \(70^\circ\), dip: \(30^\circ\) and rake: \(-50^\circ\)) placed at a depth of 25 km. The FDM simulation was conducted with a model of \(1024 \times 1024 \times 1024\) grid points, a grid size of 0.5 km and a time step of 0.025 s. The reference frequency for Q was set to 1 Hz to match the parameter of the wavenumber integration code. Figure 10 compares the vertical-component seismograms along the x direction obtained by the wavenumber integration method and OpenSWPC. The same band-pass filter of 0.01–0.5 Hz was applied to both traces, considering the expected maximum frequency (\(\sim\)0.89 Hz) of this simulation. There is excellent agreement between the synthetic seismograms of OpenSWPC and those of the wavenumber integration method from short to large distances and in the wide frequency range below ~1 Hz, demonstrating the effectiveness of OpenSWPC for simulations of seismic wave propagation in an elastic medium.

Fig. 10

Comparison of vertical-component velocity seismograms obtained by the frequency–wavenumber integral method of Saikia (1994) (black curve) and OpenSWPC (red curve) in the frequency range of 0.01–0.5 Hz under the laterally homogeneous structure of Kubo et al. (2002). The amplitude at each distance is normalized by the maximum amplitude of the trace obtained by the frequency–wavenumber integral method. Note that the two traces at the same distance are slightly shifted vertically so that they do not overlap

Simulation and visualization with a 3D velocity structure

The next example demonstrates the result of a simulation of seismic wave propagation in a 3D heterogeneous model of the Japan Integrated Velocity Structure Model (JIVSM; Koketsu et al. 2012). The JIVSM consists of a set of sedimentary layers, Conrad and Moho discontinuities, and Pacific and Philippine Sea Plate interfaces associated with oceanic Mohos and topography. These are represented by the depth variations of each layer and the medium parameters below each depth.

We performed a 3D FDM simulation of the seismic wave propagation for the 2005 West Off Fukuoka Prefecture (\(M_W=6.6\)) earthquake. The source location and the fault mechanism were taken from the F-net moment tensor catalog (Fukuyama et al. 1998). The simulation model has a spatial grid size of \(2000 \times 2560 \times 500\) with a grid spacing of 0.5 km and a time step of 0.025 s with the minimum wavespeed of 1.5 km/s for the shallow-most sedimentary layer. We evaluated the seismic wave propagation in the long period band using a source time function with a rise time of 20 s.

The record section of the vertical-component velocity traces (Fig. 11) at the station locations of High Sensitivity Seismograph Network Japan (Hi-net) operated by National Research Institute for Earth Science and Disaster Resilience (Okada et al. 2004) shows very complicated waves as a result of propagation in the heterogeneous structure. In particular, there are significant wave packets after the arrival of dispersive surface waves at epicentral distances over 500 km that seem to be incoherent in neighboring stations. The development of this peculiar wave packet is clearly seen in the snapshots of the seismic wave propagation shown in Fig. 12. It is demonstrated that the direct waves were first radiated from the epicenter and then were trapped in the very low-wavespeed layer of the accretionary prism, which was formed by the subduction of the Philippine Sea Plate (mark A). Then, part of the trapped seismic wave energy was released from the accretionary prism and propagated back to the Japanese Archipelago (marks B and C), as demonstrated by Furumura et al. (2008) and Guo et al. (2016). As a result, apparently incoherent arrivals appear in the record section (Fig. 11) due to significant off-great-circle propagation.

Fig. 11

Seismic wave traces simulated by OpenSWPC. a Vertical-component velocity traces in the frequency range of 0.025–0.1 Hz. The amplitude of each trace is normalized by its maximum amplitude. b An index map showing the station (circles) and source locations. Dashed curves show the iso-distance contour at every 200 km from the epicenter

Note that the visualized seismic wavefield (Fig. 12) was obtained using an associated tool included in the OpenSWPC package. This program reads the wavefield and topography from a NetCDF-formatted output file of the simulation NetCDF file, and the spatiotemporal wavefields over the topography map are plotted in sequentially numbered portable bitmap files.

Fig. 12

Example of a visualization of a seismic wave propagation simulation by OpenSWPC at the elapsed times of a 100 s, b 200 s, c 300 s and d 400 s. Superimposed on the topography map (color scale is shown in the right), scaled absolute amplitude of vertical- and horizontal-component ground velocity motions on the ground surface are shown in red and green, respectively. An additional file contains a time-sequential movie of the seismic wave propagation (Additional file 1)

Finite-fault rupture and coseismic deformation

The finite source rupture over the fault is represented in OpenSWPC using multiple sources. To demonstrate this feature, we set up a finite-fault model in a homogeneous half-space as depicted in Fig. 13. A Poisson medium (\(\alpha /\beta =\sqrt{3}\)) is assumed with an S-wavespeed of \(\beta = 3500\) m/s and a density of \(\rho = 2700\) kg/m\(^3\). No attenuation was included in the simulation. We assumed slip with a dip angle of 45\(^\circ\) and a fault size of 100 km \(\times\) 50 km. The slip amount is 7.5 m, which corresponds to a moment magnitude of \(M_W=8.0\). In this simulation, the finite-sized fault was represented by 100 \(\times\) 50 (=5000) point sources along the strike and dip directions, respectively. The moment release was equally distributed over the point sources assuming homogeneous slip on the fault. The rupture propagation starts from one corner of the fault and spreads over the fault with an assumed constant rupture speed of 2.5 km/s. The source rupture is expressed in OpenSWPC as the delay of the initiation time of each point source. The FDM simulation was performed with a 3D model of \(600 \times 600 \times 400\) grid points with a spatial grid size of 0.25 km and time step of 0.01 s.

Fig. 13

Geometry of the finite-fault simulation used to infer the coseismic deformation. The blue rectangle shows the fault plane, and the red rectangle shows the free surface

The simulation result 200 s after the rupture starting time was compared with the analytic solution of Okada (1985) (Fig. 14). The result of the FDM simulation for the coseismic deformation of the finite-fault source overall agrees with the analytic solution as demonstrated in previous studies (e.g., Wald and Graves 2001; Maeda and Furumura 2013). Note that the conventional sponge absorbing boundary condition of Cerjan et al. (1985) leads to an inaccurate estimation of final deformation pattern close to the edge of the fault. In this case, this is most significant in the horizontal dip (y) direction of the fault, and the error grows gradually with increasing time even after the termination of the fault rupture. Conversely, the estimation using the PML boundary agrees quite well with the analytic solution even if the boundary is very close to the fault.

Fig. 14

Comparison of the coseismic deformation of a an analytic solution of Okada (1985), b the result of a simulation by OpenSWPC with the PML boundary condition and c the result of a simulation by OpenSWPC with the sponge boundary condition of Cerjan et al. (1985). In each panel, three-component displacements (\(u_x, u_y\), and \(u_z\)) are plotted with the color scales shown below. The gray shaded areas in b and c indicate the absorber regions. Time-sequential movies corresponding to b and c are provided in Additional file 2, Additional file 3, respectively

Plane-wave incidence

Analysis of regional wave propagation from far-field body waves (Maeda et al. 2014), core reflected phases (Toya et al. 2017) and the amplification and reflection of seismic waves in the shallow structure can be effectively evaluated by considering the incidence of plane waves. An example of an FDM simulation for vertical incidences of plane P- and S-waves is shown in Fig. 15. This simulation was performed using the 2D P-SV code with a spatial grid size of 0.2 km and a time step of 0.01 s. The layered velocity structure model of Kubo et al. (2002) was assumed, and up-going P- and SV-wave packets with characteristic wavelengths of 20 km were set near the bottom of the simulation model as initial conditions.

Fig. 15

The seismic traces for vertical plane-wave incidences obtained by the 2D P-SV simulations. a Vertical-component traces for P-wave incidence with the sponge boundary condition. b Horizontal-component traces for S-wave incidence with the sponge boundary condition. c, d same as a, b but with the PML boundary condition. Waveform traces within the absorbing layer are shown as green curves, and traces at the center of the model are shown as thick red curves

It was found that the use of the PML absorbing boundary condition is essential for FDM simulations with incident plane waves as well as with coseismic deformation. With the use of the conventional sponge absorbing boundary, artificial reflections from the edge of the model at both sides of the plane wave make it difficult to recognize later arrivals of reflected and converted phases (Fig. 15a). Such artificial reflections are inevitable for plane-wave incidence because the attenuation of up-going plane waves in the absorbing boundary leads to artificial diffraction propagating out of the computational domain, and a part of this packet reflects back inside the model. The situation is much worse for S-wave incidence (Fig. 15b) due to strong S-to-P conversion at the boundary. Because the PML boundary condition does not modify seismic waves propagating parallel to the boundary, it minimizes the occurrence of such artificial reflections (Fig. 15c, d).

Simulations using the reciprocity theorem

An example of a reciprocity calculation comparing an ordinary forward simulation from the source to receiver and a simulation using the reciprocity theorem with exchanged source and receiver is shown in Fig. 16. In this simulation, we adopted the JIVSM model (Koketsu et al. 2012) in northeastern Japan including the topography variation and seawater column and calculated the seismic waveforms at a F-net station location (N.KSNF) from a source on the Pacific Plate boundary. Both simulations were performed with the same model size of \(3000 \times 2400 \times 600\) grid points with a discretization of 0.25 km in space and 0.01 s in time. The seismic source is assumed to have a finite source duration time of 4 s.

Fig. 16

An example of a computation using the reciprocity theorem. a Three-component Green’s functions of the six moment tensor components (listed on the left). Black and blue curves represent the traces calculated using the forward simulation and the reciprocity theorem, respectively. Red curves show the residual between the forward and reciprocity calculations. b An index map showing the locations of the epicenter (star) and the observed station (triangle)

Figure 16 compares the two simulations without applying a filter. They show excellent agreement with very small residual amplitudes between the two results. Note that the reciprocity theorem (Aki and Richards 2002) assures the exchange of the source and receiver positions only if the absorbing and free surface boundary conditions are satisfied (Eisner and Clayton 2001). Therefore, the good agreement between the two simulations demonstrates the effectiveness of the free surface and the absorbing boundary conditions in the FDM simulation.

This newly developed FDM simulation code for modeling of seismic wave propagation in heterogeneous viscoelastic media in 2D and 3D, OpenSWPC, has wide applicability for seismological studies and great portability, allowing excellent performance from PC clusters to supercomputers. OpenSWPC was designed to apply various models from small to regional scales with the use of various seismic source representations and velocity models. Note that all examples presented in this study were obtained by setting necessary parameters and velocity model files without modifying the simulation code itself. Using OpenSWPC, we can simulate seismic waves at regional scales of up to approximately 1 Hz using PC clusters and up to approximately 10 Hz using high-performance supercomputers. All the codes are available with detailed documents at a public repository with an open-source (MIT) license.

In OpenSWPC, we implemented a frequency-independent attenuation model by adopting the GZB. The assumption of frequency-independent attenuation is valid in the low-frequency regime below \(\sim 1\) Hz, at least to a first-order approximation (e.g., Aki and Richards 2002). However, at frequencies above about 1 Hz, attenuation appears to follow a power-law frequency dependence (e.g., Carcolé and Sato 2010; Phillips et al. 2014). Incorporation of such different frequency-dependent attenuation properties for high frequencies can also be modeled by applying the memory variable approach (Withers et al. 2015). Such implementation of more realistic attenuation characteristics to OpenSWPC is a prospect for future development.

The OpenSWPC adopts the Cartesian coordinate system to mainly apply to small-to-regional-scale seismic wave modeling. Recently, Takenaka et al. (2017) proposed a method to incorporate the effect of spherical geometry of the Earth into FDM simulation model in the Cartesian coordinate system. Incorporation of such technique may lead wider applications of OpenSWPC in global seismology without significant modification of present set of the code.

Authors’ contributions

TM developed the code based on the original version of the software developed by TF performed simulations with ST, and drafted the manuscript. ST contributed to the design and application of the numerical tests of the code. TF conceived the study and participated in the conception and design of the study. All authors participated in discussions and equally contributed to revising the draft of the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We appreciate Phil Cummins, the editor, and two anonymous reviewers for their constructive comments that greatly improved the manuscript. This study was supported by a Collaborative Research Program of the Earthquake Research Institute at the University of Tokyo (2015-B-01). TM was partly supported by the Core-to-Core Collaborative Research Program of the Earthquake Research Institute at the University of Tokyo and the Disaster Prevention Research Institute at Kyoto University (2016-K-06). We used the F-net moment tensor catalog and station location information of the National Research Institute for Earth Science and Disaster Resilience. We used computer resources of the EIC system of the Earthquake Research Institute at the University of Tokyo and the Earth Simulator of the Japan Agency for Marine–Earth Science and Technology.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Additional files

40623_2017_687_MOESM2_ESM.mp4Additional file 2. A time-sequential movie of the displacement field from a finite-fault rupture with the PML absorbing boundary condition associated with Fig. 14. (81.5 KB).

40623_2017_687_MOESM3_ESM.mp4Additional file 3. A time-sequential movie of the displacement field from a finite-fault rupture with the sponge absorbing boundary condition associated with Fig. 14. (94.7 KB).