Magnetohydrodynamics (MHD) studies the dynamics of an electrically conducting fluid under the influence of a magnetic field. Many astrophysical phenomena are governed by MHD, and computer simulations are used to model these dynamics. In this thesis, we conduct MHD simulations of non-radiative black hole accretion and of fast magnetic reconnection. By performing large-scale, three-dimensional parallel MHD simulations on supercomputers with a deformed-mesh algorithm, we were able to conduct very high dynamical range simulations of accretion onto Sgr A* at the Galactic Center. We find a generic set of solutions and make specific predictions for currently feasible observations of the rotation measure (RM). The magnetized accretion flow is subsonic and lacks outward convective flux, which makes the accretion rate very small and gives a density slope of around $-1$. There is no tendency for the flows to become rotationally supported, and the slow time variability of the RM is a key quantitative signature of this accretion flow. We also provide a constructive numerical example of fast magnetic reconnection in a three-dimensional periodic box. Reconnection is initiated by a strong, localized perturbation to the field lines, and the solution is intrinsically three-dimensional. Approximately $30\%$ of the magnetic energy is released in an event that lasts about one Alfv\'en time, but only after a delay during which the field lines evolve into a critical configuration. In the co-moving frame of the reconnection regions, reconnection occurs through an X-like point, analogous to Petschek reconnection. The dynamics appear to be driven by global flows rather than local processes. In addition to these physics results, we present results on the acceleration of MHD simulations using heterogeneous computing systems \cite{shan2006heterogeneous}.
We have implemented the MHD code on a variety of heterogeneous and multi-core architectures (multi-core x86, Cell, and Nvidia and ATI GPUs) using different languages (FORTRAN, C, Cell, CUDA and OpenCL). Initial performance results for these systems are presented, and we conclude that substantial gains in performance over traditional systems are possible. In particular, some heterogeneous systems deliver a greater percentage of their theoretical peak performance than x86 architectures do.