[1] Ken Prager. Datenkompression in CUDA mittels Hierarchischer Basen. Bakkalaureatsarbeit, Institut für Angewandte Mathematik, Universität Jena, 2014. [ bib ]
[2] G. Zumbusch. Vectorized higher order finite difference kernels. In P. Manninen and P. Öster, editors, PARA 2012, State-of-the-Art in Scientific and Parallel Computing, volume 7782 of LNCS, pages 343-357, Heidelberg, 2013. Springer. [ bib | .ps.gz | .pdf ]
Several highly optimized implementations of Finite Difference schemes are discussed. The combination of vectorization and an interleaved data layout, spatial and temporal loop tiling algorithms, loop unrolling, and parameter tuning leads to efficient computational kernels in one to three spatial dimensions, for truncation errors of order two to twelve, and for isotropic and compact anisotropic stencils. The kernels are implemented on and tuned for several processor architectures: recent Intel Sandy Bridge, Ivy Bridge and AMD Bulldozer CPU cores, all with AVX vector instructions, Nvidia Kepler and Fermi as well as AMD Southern and Northern Islands GPU architectures, and some older architectures for comparison. The kernels are either based on a cache aware spatial loop or on time-slicing to compute several time steps at once. Furthermore, vector components can either be independent, grouped in short vectors of SSE, AVX or GPU warp size, or grouped in larger virtual vectors with explicit synchronization. The optimal choice of the algorithm and its parameters depends both on the Finite Difference stencil and on the processor architecture.
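
The time-slicing idea mentioned above, computing several time steps in one sweep so that loaded data is reused, can be illustrated by a minimal sketch. It fuses two explicit steps of a simple three-point stencil; the stencil, the constants and the boundary handling are placeholders and not the paper's tuned kernels, which additionally combine this with vectorization and cache blocking.

#include <vector>
#include <cstddef>

// Fuse two explicit time steps of a three-point stencil into one sweep, so the
// intermediate step stays in registers instead of a full temporary array.
// Purely illustrative (Dirichlet boundaries held fixed, n >= 4 assumed).
void two_steps_fused(const std::vector<double>& u, std::vector<double>& w, double c)
{
    const std::size_t n = u.size();
    auto step = [c](double l, double m, double r) { return m + c * (l - 2.0 * m + r); };

    double s_im2 = u[0];                          // intermediate value at i-2
    double s_im1 = step(u[0], u[1], u[2]);        // intermediate value at i-1
    w[0] = u[0];
    for (std::size_t i = 2; i + 1 < n; ++i) {
        double s_i = step(u[i - 1], u[i], u[i + 1]);   // first step at i
        w[i - 1]   = step(s_im2, s_im1, s_i);          // second step at i-1
        s_im2 = s_im1;
        s_im1 = s_i;
    }
    w[n - 2] = step(s_im2, s_im1, u[n - 1]);      // intermediate at n-1 is the boundary value
    w[n - 1] = u[n - 1];
}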

[3] Kevin Gericke. Entwicklung eines numerischen Moleküldynamikcodes für Kettenmoleküle. wiss. Hausarbeit Staatsexamen, Universität Jena, 2013. [ bib ]
[4] Jörg Feierabend. Spektralverfahren auf GPUs. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2013. [ bib ]
[5] G. Zumbusch. Tuning a finite difference stencil. Poster, GPU Technology Conference 2012 (GTC), San Jose, CA, 2012. [ bib | .pdf ]
[6] G. Zumbusch. Tuning a finite difference computation for parallel vector processors. In M. Bader, H.-J. Bungartz, D. Grigoras, M. Mehl, R.-P. Mundani, and R. Potolea, editors, 2012 11th International Symposium on Parallel and Distributed Computing, CPS, pages 63-70. IEEE Press, 2012. DOI 10.1109/ISPDC.2012.17. [ bib | .pdf ]
Current CPU and GPU architectures heavily use data and instruction parallelism at different levels. Floating point operations are organised in vector instructions of increasing vector length. For reasons of performance it is mandatory to use the vector instructions efficiently. Several ways of tuning a model problem finite difference stencil computation are discussed. The combination of vectorisation and an interleaved data layout, cache aware algorithms, loop unrolling, parallelisation and parameter tuning leads to optimised implementations at a level of 90% peak performance of the floating point pipelines on recent Intel Sandy Bridge and AMD Bulldozer CPU cores, both with AVX vector instructions, as well as on Nvidia Fermi/Kepler GPU architectures. Furthermore, we present numbers for parallel multi-core/multi-processor and multi-GPU configurations. They regularly represent more than an order of magnitude of speedup compared to a standard implementation. The analysis may also explain deficiencies of automatic vectorisation for linear data layout and serve as a foundation of efficient implementations of more complex expressions.
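
The interleaved data layout at the heart of this tuning can be sketched roughly as follows: VL independent grid lines are stored interleaved, so that the short inner loop over lines maps onto one vector instruction. The vector length VL, the stencil and the coefficients are illustrative assumptions, not the measured 90%-of-peak kernels.

#include <vector>
#include <cstddef>

// Interleaved layout: u[(i * VL) + l] is grid point i of line l, so VL lines
// are updated by one pass of vector instructions. Sketch only.
constexpr std::size_t VL = 8;   // assumed vector length (e.g. AVX single precision)

void stencil_step(const std::vector<double>& u, std::vector<double>& v,
                  std::size_t n /* points per line */, double c)
{
    for (std::size_t i = 1; i + 1 < n; ++i) {
        const double* um = &u[(i - 1) * VL];
        const double* u0 = &u[ i      * VL];
        const double* up = &u[(i + 1) * VL];
        double*       vo = &v[ i      * VL];
        // The short inner loop over the interleaved lines vectorizes directly.
        #pragma omp simd
        for (std::size_t l = 0; l < VL; ++l)
            vo[l] = u0[l] + c * (um[l] - 2.0 * u0[l] + up[l]);
    }
}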

[7] G. Zumbusch. Galerkin schemes for general relativity. Poster, Advances and Challenges in Computational General Relativity (ACCGR), Brown University, 2011. [ bib | .pdf ]
Numerical schemes for Einstein's vacuum equation of general relativity are developed. The equation in harmonic gauge is discretized in space-time by Galerkin methods. A split into space and time leads to time-stepping schemes for second order wave equations. Finite Element and Discontinuous Galerkin methods are covered, along with local mesh refinement in space-time.

[8] Sven Radszuwill. Effiziente Lösung der Poissongleichung mit Multicore-Anwendungen. Bachelorarbeit, Institut für Angewandte Mathematik, Universität Jena, 2011. [ bib ]
[9] Richard Henze. Effizientes Lösen der Wärmeleitungsgleichung auf einer GPU. Bachelorarbeit, Institut für Angewandte Mathematik, Universität Jena, 2011. [ bib ]
[10] Roland Wickles. Konvergenzuntersuchungen bei Lösungen der Wellengleichung mittels finiter Differenzen. Bachelorarbeit, Institut für Angewandte Mathematik, Universität Jena, 2011. [ bib ]
[11] Katharina Büchse. Datenstrukturanalyse für die adaptive Finite-Elemente-Methode. Diplomarbeit, STIFT-Preis 2010, Institut für Angewandte Mathematik, Universität Jena, 2010. [ bib ]
This thesis discusses the challenges that arise from the use of different storage media, here primarily main memory and hard disk, when solving problems with adaptive finite elements. Data structures are analysed for their suitability for this specific mathematical problem, and tools are provided that allow working with the required data without much updating overhead. The actual computation is adapted so that the data exchange between main memory and hard disk, which constitutes a (performance) bottleneck, is restricted as much as possible.

[12] Gernot Geppert. Numerische Methoden zur Lösung nichtlinearer, schlecht gestellter 3D-Tomographie-Probleme aus der Atmosphärenfernerkundung. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2010. [ bib ]
[13] G. Zumbusch. Finite Element, Discontinuous Galerkin, and Finite Difference evolution schemes in spacetime. Class. Quantum Grav., 26:175011, 2009. [ bib | .ps | .pdf ]
Numerical schemes for the vacuum Einstein equations are developed. The Einstein equation in harmonic gauge is second order symmetric hyperbolic. It is discretized in four-dimensional spacetime by Finite Differences, Finite Elements, and Interior Penalty Discontinuous Galerkin methods, the latter related to Regge calculus. The schemes are split into space and time and new time-stepping schemes for wave equations are derived. The methods are evaluated for linear and non-linear test problems of the Apples-with-Apples collection.

[14] G. Zumbusch. Portable multi-level parallel programming for Cell processor, GPU, and clusters. In Proc. Para08, LNCS, Heidelberg, 2009. Springer. [ bib | http | .ps.gz | .pdf ]
High performance computers offer lots of parallelism at different levels of vectorization, thread parallelism, message-passing between distributed memory architectures and even function off-loading by hardware accelerators. Large scale numerical simulations often have lots of parallelism, which may be difficult to express in a high level programming language. A common abstract parallel programming style is proposed, which can be translated automatically into parallel code for one or a combination of common programming styles for different parallel architectures.

[15] Frank Peuker. Simplicial Methods for Solving Selected Problems in General Relativity Numerically. Regge Calculus and the Finite-Element Method. PhD thesis, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib | .pdf ]
[16] Marcus Fritzsche. Parallel numerical simulation of Navier-Stokes-equations on GPUs. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib | .pdf ]
This diploma thesis deals with solving the Navier-Stokes equations numerically, with an emphasis on parallelization on GPUs. The work has shown that the computation of the Navier-Stokes equations can be accelerated significantly by using GPUs. The simulations show a speedup factor of at least 20. It results from parallelizing the SOR method, which is used to solve the discretization of the Poisson pressure equation.
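
The kernel being parallelized is an SOR sweep for the pressure Poisson equation. A plain CPU red-black sketch is shown below; red-black ordering is the usual way to make SOR data parallel (one GPU thread per grid point of a color). The relaxation factor, data layout and boundary treatment are placeholders, not the thesis code.

#include <vector>
#include <cstddef>

// One red-black SOR sweep for the 2-D pressure Poisson equation
// laplace(p) = rhs on a uniform grid with spacing h. Illustrative sketch only.
void sor_sweep(std::vector<double>& p, const std::vector<double>& rhs,
               std::size_t nx, std::size_t ny, double h, double omega)
{
    auto idx = [nx](std::size_t i, std::size_t j) { return j * nx + i; };
    for (int color = 0; color < 2; ++color)
        for (std::size_t j = 1; j + 1 < ny; ++j)
            for (std::size_t i = 1; i + 1 < nx; ++i) {
                if (int((i + j) % 2) != color) continue;
                double gs = 0.25 * (p[idx(i - 1, j)] + p[idx(i + 1, j)]
                                  + p[idx(i, j - 1)] + p[idx(i, j + 1)]
                                  - h * h * rhs[idx(i, j)]);
                p[idx(i, j)] = (1.0 - omega) * p[idx(i, j)] + omega * gs;
            }
}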

[17] Till W. Riffert. Modellierung und Approximation von Zufallsfeldern mit Methoden der hierarchischen Matrizen. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib ]
[18] Christian Reibiger. Lösung elliptischer Randwertprobleme mit Hilfe der CUDA Technologie. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib | .pdf ]
In this thesis, a Krylov subspace method for solving linear algebraic systems of equations for special sparse matrices (here called 27-diagonal matrices) was first implemented. Such matrices are obtained in particular from the discretization of boundary value problems. By offloading large parts of the program onto the many parallel processors of a graphics card, a considerable computing performance was already achieved. The solver was then accelerated further by implementing and testing several variants of domain decomposition and multigrid methods. It is to be expected that a more advanced graphics card will allow an even greater speedup, since presumably larger subdomains can then be used in the preconditioners. Finally, the theoretical basis was laid for applying the developed program to the solution of a nonlinear boundary value problem from astrophysics.
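
The core kernel of such a solver, applying a 27-diagonal matrix without storing it, amounts to a 3x3x3 stencil loop. The sketch below is a plain CPU version with constant stencil weights and a crude boundary treatment, both placeholders; the thesis implements this as CUDA kernels with more general coefficients.

#include <vector>
#include <cstddef>

// Apply y = A*x for a matrix with 27 (off-)diagonals, i.e. a 3x3x3 stencil on
// an nx*ny*nz grid, without storing A. Weights w and boundary copy are
// illustrative only.
void apply_27pt(const std::vector<double>& x, std::vector<double>& y,
                std::size_t nx, std::size_t ny, std::size_t nz,
                const double (&w)[3][3][3])
{
    auto idx = [nx, ny](std::size_t i, std::size_t j, std::size_t k) {
        return (k * ny + j) * nx + i;
    };
    for (std::size_t k = 0; k < nz; ++k)
        for (std::size_t j = 0; j < ny; ++j)
            for (std::size_t i = 0; i < nx; ++i) {
                if (i == 0 || j == 0 || k == 0 || i == nx - 1 || j == ny - 1 || k == nz - 1) {
                    y[idx(i, j, k)] = x[idx(i, j, k)];   // crude boundary treatment
                    continue;
                }
                double s = 0.0;
                for (std::size_t dk = 0; dk < 3; ++dk)
                    for (std::size_t dj = 0; dj < 3; ++dj)
                        for (std::size_t di = 0; di < 3; ++di)
                            s += w[dk][dj][di] * x[idx(i - 1 + di, j - 1 + dj, k - 1 + dk)];
                y[idx(i, j, k)] = s;
            }
}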

[19] Anja Boos. Schurkomplement-Vorkonditionierer für pseudospektrale Diskretisierungen. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib | .pdf ]
The aim of the present thesis is to examine the advantages and disadvantages of spectral methods. In practice, finite element methods using piecewise linear functions are frequently employed. Their advantage is obvious: the stiffness matrix is sparse, and the corresponding system can therefore be solved quickly. In this thesis, a partial differential equation on a given domain with Dirichlet or Cauchy boundary conditions is solved numerically by means of spectral methods. The implementation is done in the programming language C/C++. The focus is on ways to optimise the time and computational effort; for this, several approaches are compared with one another. Besides domain decomposition methods, these include different preconditioners for the respective subproblems.

[20] Kathleen Schuhmacher. Ausgewählte numerische Probleme im gymnasialen Mathematikunterricht. wiss. Hausarbeit Staatsexamen, Institut für Angewandte Mathematik, Universität Jena, 2009. [ bib ]
[21] G. Zumbusch. A container-iterator parallel programming model. In R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Wasniewski, editors, Parallel Processing and Applied Mathematics (PPAM 2007), volume 4967 of LNCS, pages 1130-1139, Heidelberg, 2008. Springer. [ bib | http | .ps.gz | .pdf ]
There are several parallel programming models available for numerical computations at different levels of expressibility and ease of use. For the development of new domain specific programming models, a splitting into a distributed data container and parallel data iterators is proposed. Data distribution is implemented in application specific libraries. Data iterators are directly analysed and compiled automatically into parallel code. Target architectures of the source-to-source translation include shared (pthreads, Cell SPE), distributed memory (MPI) and hybrid programming styles. A model application for grid-based hierarchical numerical methods and an auto-parallelizing compiler are introduced.
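
To convey the flavour of the container-iterator split, here is a hypothetical sketch: DistributedGrid and foreach_owned are invented names for this illustration, not the syntax recognised by the proposed source-to-source translation, which analyses such iterators and emits MPI, pthreads or hybrid code.

#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical container-iterator pair: the application owns a distributed
// container and expresses all parallel loops as a "foreach" over its local part.
struct DistributedGrid {
    std::vector<double> local;                  // locally owned values
    std::size_t global_offset = 0;              // first global index owned here

    void foreach_owned(const std::function<void(std::size_t, double&)>& body) {
        for (std::size_t i = 0; i < local.size(); ++i)
            body(global_offset + i, local[i]);  // a real backend could split this
    }                                           // loop across threads or processes
};

// Usage: the numerics only states what happens per element.
void scale(DistributedGrid& g, double a) {
    g.foreach_owned([a](std::size_t /*gid*/, double& v) { v *= a; });
}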

[22] Thomas Franz. Schnelle Multipol-Methode. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2008. [ bib | .pdf ]
This thesis deals with the computation of interactions between charged particles according to Coulomb's law. If the forces are computed for each particle individually, this leads to an algorithm of order O(N^2), which for large particle numbers is impractical, if not impossible, to compute. Many researchers have therefore worked on algorithms that simplify and greatly accelerate these computations. This thesis is essentially based on the articles of L. Greengard, V. Rokhlin, R. Beatson and J. Carrier, who developed the Fast Multipole Method, an algorithm of order O(N log N). It thereby becomes possible to compute the interactions even for very large numbers of particles. The goal of the diploma thesis was the implementation of the multipole algorithm in Java, with particular emphasis on parallelization with the help of threads.
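
For reference, the direct pairwise evaluation that the Fast Multipole Method accelerates looks as follows. This O(N^2) baseline is written in C++ for uniformity with the other sketches, whereas the thesis implements the multipole algorithm itself in Java with threads; units and prefactors are omitted.

#include <cmath>
#include <cstddef>
#include <vector>

struct Particle { double x, y, z, q; };

// Direct evaluation of the Coulomb potential at each particle:
// phi_i = sum_{j != i} q_j / |r_i - r_j|.  O(N^2) work; this is the reference
// the O(N log N) fast multipole method is measured against.
std::vector<double> direct_potential(const std::vector<Particle>& p)
{
    const std::size_t n = p.size();
    std::vector<double> phi(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)            // trivially parallel over i
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;
            const double dx = p[i].x - p[j].x;
            const double dy = p[i].y - p[j].y;
            const double dz = p[i].z - p[j].z;
            phi[i] += p[j].q / std::sqrt(dx * dx + dy * dy + dz * dz);
        }
    return phi;
}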

[23] M. Griebel, S. Knapek, and G. Zumbusch. Numerical Simulation in Molecular Dynamics: Numerics, Algorithms, Parallelization, Applications, volume 5 of Texts in Computational Science and Engineering. Springer, Berlin, Heidelberg, 2007. [ bib | http ]
Particle models play an important role in many applications in physics, chemistry and biology. These can be studied on the computer with the help of molecular dynamics simulations. This book presents in detail the necessary numerical methods, the theoretical background and foundations, and the techniques involved, including the linked-cell method, the SPME method, tree codes, and the multipole technique. It illustrates such aspects as modeling, discretization, algorithms and their parallel implementation with MPI on computer systems with distributed memory. The text goes on to offer detailed explanations of the different steps of numerical simulation, providing illustrative code examples. With the description of the algorithms and the presentation of the results of various simulations from fields such as material science, nanotechnology, biochemistry and astrophysics, the reader of this book will learn step by step how to write programs capable of running successful experiments for molecular dynamics.
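
One of the techniques covered, the linked-cell method, restricts short-range force evaluation to particles in the same or neighbouring cells, which gives O(N) work for a fixed cutoff. The fragment below only sketches the cell bookkeeping (serial, no periodic boundaries, force loop over the neighbouring cells omitted) and is not taken from the book.

#include <cstddef>
#include <vector>

// Minimal linked-cell bookkeeping: the box is divided into m^3 cells of edge
// length >= cutoff and particles are sorted into their cells; interactions are
// then only evaluated between neighbouring cells. Simplified sketch.
struct CellList {
    std::size_t m;                                // cells per direction
    std::vector<std::vector<std::size_t>> cells;  // particle indices per cell

    CellList(std::size_t m_) : m(m_), cells(m_ * m_ * m_) {}

    std::size_t cell_of(double x, double y, double z, double box) const {
        auto c = [&](double u) { std::size_t i = std::size_t(u / box * m); return i < m ? i : m - 1; };
        return (c(z) * m + c(y)) * m + c(x);
    }
    void build(const std::vector<double>& x, const std::vector<double>& y,
               const std::vector<double>& z, double box) {
        for (auto& c : cells) c.clear();
        for (std::size_t i = 0; i < x.size(); ++i)
            cells[cell_of(x[i], y[i], z[i], box)].push_back(i);
    }
};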

[24] G. Zumbusch. Data dependence analysis for parallel tree codes. In B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, editors, Applied Parallel Computing. State of the Art in Scientific Computing (PARA 2006), volume 4699 of LNCS, pages 890-899, Heidelberg, 2007. Springer. DOI 10.1007/978-3-540-75755-9_106. [ bib | http | .ps.gz | .pdf ]
Data dependence analysis for automatic parallelization of sequential tree codes is discussed. Hierarchical numerical algorithms often use tree data structures for unbalanced, adaptively and dynamically created trees. Moreover, such codes often do not follow a strict divide and conquer concept, but introduce some geometric neighborhood data dependence in addition to parent-children dependencies. Hence, recognition mechanisms and hierarchical partition strategies of trees are not sufficient for automatic parallelization. Generic tree traversal operators are proposed as a domain specific language. Additional geometric data dependence can be specified by code annotation. A code transformation system with data dependence analysis is implemented, which generates several versions of parallel codes for different programming models.
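
The flavour of such generic tree traversal operators might be conveyed by the following sketch; Node, traverse and the callbacks are invented for illustration, whereas the actual system analyses annotated traversals in the application code and generates the parallel versions from them.

#include <memory>
#include <vector>

// Generic traversal over an unbalanced k-ary tree: the numerics only provides
// per-node callbacks for the downward (parent -> children) and upward
// (children -> parent) passes, which is exactly the dependence information a
// parallelizer needs. Geometric neighbour dependencies would be declared by
// annotations and are not modelled here.
struct Node {
    double value = 0.0;
    std::vector<std::unique_ptr<Node>> children;
};

template <class Down, class Up>
void traverse(Node& n, Down down, Up up)
{
    down(n);                       // work the children may depend on
    for (auto& c : n.children)     // independent subtrees: candidates for
        traverse(*c, down, up);    // parallel execution by the generated code
    up(n);                         // work that depends on the children
}

// Usage: accumulate child values into each parent (an upward pass).
inline void sum_up(Node& root)
{
    traverse(root,
             [](Node&) {},
             [](Node& n) { for (auto& c : n.children) n.value += c->value; });
}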

[25] Mathias Langner. Numerische Behandlung von contingent claims. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2007. [ bib | .pdf ]
This thesis deals with the numerical valuation of two-dimensional financial derivatives. First, partial differential equations describing the evolution of these derivatives are introduced. A finite volume method is used to discretize the differential equations. In the course of the thesis, several modifications of the scheme are made. First, convection dominance is addressed. In order to obtain stable schemes with high convergence, two flux limiters are presented: the van Leer and the van Albada limiter. Since both limiters are TVD schemes and therefore achieve at most linear convergence, modified limiter schemes are developed, a combination of central differences and flux limiters. A further modification of the method concerns the grid. Graded grids allow targeted local refinement while keeping the number of discretization points constant. Sparse grids follow an approach based on hierarchical bases and allow a computation with substantially less memory at the price of an only slightly worse error behaviour. A drawback of sparse grids is the rather complicated structure arising from the hierarchical bases. An alternative is the combination technique, which decomposes the overall problem into several subproblems that can in turn be solved with standard methods. The error behaviour is equivalent to that on sparse grids.
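
The two limiters mentioned are standard textbook functions of the smoothness ratio r (the ratio of consecutive solution differences) and can be quoted compactly; this is illustrative C++, not code from the thesis.

#include <cmath>

// Classical flux limiters: both vanish for r <= 0 (near extrema) and stay
// bounded, which is what makes the resulting schemes TVD. Standard formulas,
// quoted for illustration only.
inline double van_leer(double r)   { return (r + std::fabs(r)) / (1.0 + std::fabs(r)); }
inline double van_albada(double r) { return r > 0.0 ? (r * r + r) / (r * r + 1.0) : 0.0; }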

[26] G. Zumbusch. Data parallel iterators for hierarchical grid and tree algorithms. In W. E. Nagel, W. V. Walter, and W. Lehner, editors, Euro-Par 2006 Parallel Processing, volume 4128 of LNCS, pages 625-634, Heidelberg, 2006. Springer. DOI 10.1007/11823285_65. [ bib | http | .ps.gz | .pdf ]
The data parallel programming language construct of a “foreach” loop is proposed in the context of hierarchically nested arrays and unbalanced k-ary trees used in high performance applications. In order to perform an initial evaluation, an implementation of an automatic parallelization system for C++ programs is introduced, which consists of a preprocessor and a matching library for distributed memory, shared memory and mixed model parallelism. For a full compile time dependence analysis and a tight distributed memory parallelization, some additional application knowledge about alignment of arrays or indirect data access can be put into the application's code data declarations. Results for a multigrid and a fast multipole benchmark code illustrate the concept.

[27] Stefan Fleischer. Numerik von Optionsgeschäften unter Zuhilfenahme der Kombinationstechnik. Diplomarbeit, Institut für Angewandte Mathematik, Universität Jena, 2005. [ bib | .pdf ]
For European and American options as well as for two- and three-dimensional basket options, the fair price is computed numerically from the Black-Scholes equations. Since closed-form solutions do not exist in most cases, the multivariate parabolic option pricing problems have to be solved numerically. Uniqueness is guaranteed via the initial and boundary value problems. Approximate solutions of these differential equations are then computed by discretizing the equations and solving the resulting systems of equations. The parabolic equations are discretized with finite differences on strongly anisotropic, regular, rectangular full grids. For American options, the obstacle problem is solved numerically by the projected successive overrelaxation iteration. The basket options are based on advection-diffusion-reaction equations. To keep the numerical effort acceptable, the combination technique based on sparse grid theory is employed. The numerical effort for increasing problem dimension can thereby be reduced substantially without significantly degrading the information content of the solution. This decisively counteracts the “curse of dimensionality” and improves the complexity of the algorithms. The different solutions on the strongly anisotropic, regular, rectangular and coarser sparse grids are combined linearly in a suitable way and yield, up to a logarithmic factor, the same numerical error as the numerical solution on a finer isotropic reference full grid. The presented theory is confirmed by empirical results.

[28] M. Griebel, S. Knapek, G. Zumbusch, and A. Caglar. Numerische Simulation in der Moleküldynamik. Numerik, Algorithmen, Parallelisierung, Anwendungen. Springer, Berlin, Heidelberg, 2004. [ bib | http ]
This textbook introduces the most important simulation techniques for the numerical treatment of Newton's equations of motion. The emphasis is on the fast evaluation of short- and long-range forces by means of linked-cell, P3M, tree and multipole methods, as well as their parallel implementation and load balancing on computing systems with distributed memory. The individual chapters give detailed guidance for turning the methods step by step into a program package. Numerous colour figures show simulation results for a range of applications.

[29] X. Cai, A. M. Bruaset, H. P. Langtangen, G. T. Lines, K. Samuelsson, W. Shen, A. Tveito, and G. Zumbusch. Performance modeling of PDE solvers. In H. P. Langtangen and A. Tveito, editors, Advanced Topics in Computational Partial Differential Equations, volume 33 of Lecture Notes in Computational Science and Engineering, chapter 9, pages 361-400. Springer, Berlin, Germany, 2003. [ bib | .ps.gz | .pdf ]
The primary purpose of this report is to collect some information on the CPU-time consumption in a series of typical numerical simulations for solving partial differential equations (PDEs). We would like to establish, through analyzing the CPU-measurements, the performance model for a number of numerical methods when applied in different model problems. All the simulators, i.e. the software programs carrying out the simulations, have been developed using Diffpack which is a generic C++ library based on object-oriented programming techniques. Therefore, the established performance models may offer a rough prediction of real CPU-time consumption by actual Diffpack simulators for practical problems. Additionally, the report can also be regarded as an investigation of the computational efficiency of the current Diffpack implementations. Last but not least, we wish to point out to Diffpack programmers some implementation issues that may affect the performance of the simulators.

[30] K.-A. Mardal, G. W. Zumbusch, and H. P. Langtangen. Software tools for multigrid methods. In H. P. Langtangen and A. Tveito, editors, Advanced Topics in Computational Partial Differential Equations, volume 33 of Lecture Notes in Computational Science and Engineering, chapter 3, pages 97-152. Springer, Berlin, Germany, 2003. [ bib ]

[31] G. Zumbusch. Parallel Multilevel Methods. Adaptive Mesh Refinement and Loadbalancing. Advances in Numerical Mathematics. Teubner, 2003. [ bib | http ]
Main aspects of the efficient treatment of partial differential equations are discretisation, multilevel/multigrid solution and parallelisation. These distinct topics are covered from the historical background to modern developments. It is demonstrated how the ingredients can be put together to give an adaptive and parallel multilevel approach for the solution of elliptic boundary value problems. Error estimators and adaptive grid refinement techniques for ordinary and for sparse grid discretisations are presented. Different types of additive and multiplicative multilevel solvers are discussed with respect to parallel implementation and application to adaptive refined grids. Efficiency issues are treated both for the sequential multilevel methods and for the parallel version by hash table storage techniques. Finally, space-filling curve enumeration for parallel load balancing and processor cache efficiency are discussed.

[32] M. Griebel and G. Zumbusch. Hash based adaptive parallel multilevel methods with space-filling curves. In Horst Rollnik and Dietrich Wolf, editors, NIC Symposium 2001, volume 9 of NIC Series, ISBN 3-00-009055-X, pages 479-492, Germany, 2002. Forschungszentrum Jülich. [ bib | .ps.gz | .pdf ]
The solution of partial differential equations on a parallel computer usually follows the data parallel paradigm. The grid is partitioned and mapped onto the processors. In this paper a parallelisable and cheap method based on space-filling curves is proposed. The partitioning is embedded into the parallel solution algorithm using multilevel iterative solvers and adaptive grid refinement. Numerical experiments on two massively parallel computers prove the efficiency of this approach.
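
The partitioning idea can be sketched in a few lines: order the cells along a space-filling curve and cut the ordered sequence into equal pieces. The sketch below uses a Morton (Z-order) key and unweighted cells for brevity, whereas the paper works with Hilbert-type curves, weighted cells and an embedding into the adaptive multilevel solver.

#include <algorithm>
#include <cstdint>
#include <cstddef>
#include <vector>

// Partition 2-D grid cells among P processes by sorting them along a Morton
// (Z-order) space-filling curve and cutting the sorted sequence into P pieces.
static std::uint64_t morton(std::uint32_t x, std::uint32_t y)
{
    std::uint64_t key = 0;
    for (int b = 0; b < 32; ++b) {            // interleave the bits of x and y
        key |= (std::uint64_t(x >> b) & 1u) << (2 * b);
        key |= (std::uint64_t(y >> b) & 1u) << (2 * b + 1);
    }
    return key;
}

struct Cell { std::uint32_t ix, iy; };        // integer cell coordinates

// Returns owner[i] = process that cell i is mapped to (cells must be non-empty).
std::vector<int> partition(const std::vector<Cell>& cells, int P)
{
    std::vector<std::size_t> order(cells.size());
    for (std::size_t i = 0; i < order.size(); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return morton(cells[a].ix, cells[a].iy) < morton(cells[b].ix, cells[b].iy);
    });
    std::vector<int> owner(cells.size());
    for (std::size_t r = 0; r < order.size(); ++r)
        owner[order[r]] = int(r * std::size_t(P) / order.size());
    return owner;
}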

[33] G. Zumbusch. Load balancing for adaptively refined grids. Proc. Appl. Math. Mech., (1):534-537, 2002. also as report 722 SFB 256, University Bonn. [ bib | .ps.gz | .pdf ]
The solution of partial differential equations on a parallel computer is usually done by a data parallel approach. The grid is partitioned and mapped onto the processors. However, partitioning of unstructured meshes and adaptively refined meshes in general is an NP-hard problem and heuristics are needed. In this paper a parallelisable and cheap method based on space-filling curves is analysed. Quasi-optimal estimates are derived for partitions of adaptively refined grids.

[34] G. W. Zumbusch. Adaptive Parallel Multilevel Methods for Partial Differential Equations. Habilitation, Universität Bonn, 2001. [ bib ]
In this text we propose a space-filling curve enumeration scheme for the load balancing problem. It is based on the principles of self-similarity and scaling invariance. It provides even multilevel locality, i.e. as much locality on each scale as possible. We introduce the space-filling curve schemes and prove some of the properties of the partitions. The scheme is cheap, deterministic, incremental, can be parallelised and provides acceptable partitions. However, even more striking, it seems to be one of the few partitioning methods where quasi-optimal estimates can be shown. We are able to derive sharp estimates both on the partition and on the multilevel algorithms on the partition, which is more than is known about competing graph partitioning load balancing methods so far. Furthermore, we give a survey of the three main aspects of the efficient treatment of PDEs, that is, discretisation, multilevel solution and parallelisation. We will treat the main features of each of the three distinct topics and cover the historical background and modern developments. We demonstrate how all three ingredients can be put together to give an adaptive and parallel multilevel approach for the solution of PDEs. Error estimators and adaptive grid refinement techniques for ordinary and for sparse grid discretisations are presented. Different types of additive and multiplicative multilevel solvers are discussed with respect to parallel implementation and application to adaptive refined grids. Efficiency issues are treated both for the sequential multilevel methods and for the parallel version by hash table storage techniques. Furthermore, space-filling curve enumeration for parallel load balancing and processor cache efficiency are discussed. We will apply the method to elliptic boundary value problems. We are able to derive estimates for the quality of the partitions by space-filling curves and the load balancing of the numerical algorithms on the grids. Even for adaptive grid refinement within certain ranges we are able to prove that the partitions are quasi-optimal, i.e. the cut sizes of the dual graph are only a constant factor away from optimum independent of the mesh size. Hence we obtain asymptotic optimality of the parallel algorithms. This seems to be remarkable in comparison to graph based heuristics, where little is known about the quality. Furthermore we were able to demonstrate the performance of the method on a range of the world's largest parallel computers, namely ASCI Blue Pacific and a prototype Cray T3E (now presumably at NSA), which are each larger than any non-US system. We complement this data by simulations run on Parnass2, which was the first non-US self-made cluster in the list of the world's largest 500 computers (TOP500). We also demonstrate that this cluster is able to outperform many other commercial parallel computers on a per processor base.

[35] G. W. Zumbusch. On the quality of space-filling curve induced partitions. Z. Angew. Math. Mech., 81:25-28, 2001. Suppl. 1, also as report SFB 256, University Bonn, no. 674, 2000. [ bib | .ps.gz | .pdf ]
The solution of partial differential equations on a parallel computer is usually done by a domain decomposition approach. The mesh is split into several partitions mapped onto the processors. However, partitioning of unstructured meshes and adaptive refined meshes in general is an NP-hard problem and heuristics are used. In this paper space-filling curve based partition methods are analysed and bounds for the quality of the partitions are given. Furthermore estimates for parallel numerical algorithms such as multigrid and wavelet methods on these partitions are derived.

[36] M. Griebel, F. Kiefer, and G. Zumbusch. Vorgänge möglichst realitätsnah simulieren. Wissenschaftliches Rechnen als neue Dimension in der Forschung. Bonner Universitätsnachrichten, 217:48-49, January 2000. also as `Simulating Processes as Realistically as Possible', Bonn University News International (BUNI), 7:28-29, November 2000. [ bib | .html | .ps.gz | .pdf ]
Computers are becoming ever faster and more powerful. This holds not only for the personal computer in the office, but also for the world's fastest high-performance parallel computers, which today can already perform more than a trillion arithmetic operations in a single second. Thanks to them in particular, “scientific computing” and “numerical simulation” have established themselves in the natural sciences as a promising third way of describing reality, alongside the traditional practical approach (experiment and observation) and the theoretical approach (mathematical formulation). In Bonn, the specialists in this field work in the Abteilung für Wissenschaftliches Rechnen und Numerische Simulation at the Institut für Angewandte Mathematik.

[37] M. Griebel and G. Zumbusch, editors. Computing, volume 64(4). Springer, Vienna, Austria, 2000. (guest editors) special issue multigrid methods. [ bib | http ]
In October 1998 the tenth workshop in a series of biannual GAMM seminars on multigrid methods was held. Almost two decades have passed since the first one, and the topics of the seminars provide a good insight into the progress during this period. The series began in the former GDR and is nowadays organised in cooperation with the GAMM-Committees for “Discretization Methods in Solid Mechanics” and “Efficient Numerical Methods for PDEs”. As this was the tenth anniversary of the series, the rather general title was chosen “International GAMM-Workshop on Multigrid Methods”...

[38] M. Griebel and G. W. Zumbusch. Parallel adaptive subspace correction schemes with applications to elasticity. Computer Methods in Applied Mechanics and Engineering, 184:303-332, 2000. [ bib | .ps.gz | .pdf ]
In this paper, we give a survey on the three main aspects of the efficient treatment of PDEs, i.e. adaptive discretization, multilevel solution and parallelization. We emphasize the abstract approach of subspace correction schemes and summarize its convergence theory. Then, we give the main features of each of the three distinct topics and treat the historical background and modern developments. Furthermore, we demonstrate how all three ingredients can be put together to give an adaptive and parallel multilevel approach for the solution of elliptic PDEs and especially of linear elasticity problems. We report on numerical experiments for the adaptive parallel multilevel solution of some test problems, namely the Poisson equation and Lamé's equation. Here, we emphasize the parallel efficiency of the adaptive code even for simple test problems with little work to distribute, which is achieved through hash storage techniques and space-filling curves.

[39] M. Griebel and G. W. Zumbusch. Preface. Computing, 64(4):287, 2000. (guest editors) special issue multigrid methods. [ bib | http ]
In October 1998 the tenth workshop in a series of biannual GAMM seminars on multigrid methods was held. Almost two decades have passed since the first one, and the topics of the seminars provide a good insight into the progress during this period. The series began in the former GDR and is nowadays organised in cooperation with the GAMM-Committees for “Discretization Methods in Solid Mechanics” and “Efficient Numerical Methods for PDEs”. As this was the tenth anniversary of the series, the rather general title was chosen “International GAMM-Workshop on Multigrid Methods”...

[40] R. Hochmuth, S. Knapek, and G. Zumbusch. Tensor products of Sobolev spaces and applications. submitted, 2000. also as Technical Report 685, SFB 256, Univ. Bonn. [ bib | .ps.gz | .pdf ]
In many cases the approximation of solutions to variational problems involving isotropic Sobolev spaces has a complexity which depends exponentially on the dimension. However, if the solutions possess dominating mixed derivatives one can find discretizations to the corresponding variational problems with a lower complexity - sometimes even independent of the dimension. In order to analyse these effects, we relate tensor products of Sobolev spaces with spaces with dominating mixed derivatives. Based on these considerations we construct families of finite dimensional anisotropic approximation spaces which generalize in particular sparse grids. The obtained estimates demonstrate, in which cases a complexity independent or nearly independent of the dimension can be expected. Finally numerical experiments demonstrate the usefulness of the suggested approximation spaces.

[41] G. W. Zumbusch. A sparse grid PDE solver. In H. P. Langtangen, A. M. Bruaset, and E. Quak, editors, Advances in Software Tools for Scientific Computing, volume 10 of Lecture Notes in Computational Science and Engineering, chapter 4, pages 133-177. Springer, Berlin, Germany, 2000. (Proceedings SciTools '98). [ bib | Zentralblatt Math | .ps.gz | .pdf ]
Sparse grids are an efficient approximation method for functions, especially in higher dimensions d >= 3. Compared to regular, uniform grids of mesh parameter h, which contain h^{-d} points in d dimensions, sparse grids require only h^{-1}|log h|^{d-1} points due to a truncated, tensor-product multi-scale basis representation. The purpose of this paper is to survey some activities for the solution of partial differential equations with methods based on sparse grids. Furthermore some aspects of sparse grids are discussed, such as adaptive grid refinement, parallel computing, a space-time discretization scheme and the structure of a code to implement these methods.
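
The quoted point counts are easy to check numerically. The small program below compares interior points of full grids and of standard sparse grids of level n (mesh width h = 2^{-n}); constants differ between sparse grid variants, so only the orders of magnitude matter here.

#include <cstdio>
#include <cstdint>

// Count interior points: a full grid has (2^n - 1)^d of them, a standard
// sparse grid keeps only hierarchical levels l with |l|_1 <= n + d - 1 and
// therefore O(2^n n^(d-1)) points.
static std::uint64_t sparse_points(int d, int level_budget)
{
    if (d == 0) return 1;                                // all dimensions assigned
    std::uint64_t count = 0;
    for (int l = 1; l <= level_budget - (d - 1); ++l)    // leave >= 1 level per remaining dim
        count += (std::uint64_t(1) << (l - 1)) * sparse_points(d - 1, level_budget - l);
    return count;
}

int main()
{
    const int d = 3;
    for (int n = 2; n <= 10; ++n) {
        std::uint64_t full = 1;
        for (int k = 0; k < d; ++k) full *= (std::uint64_t(1) << n) - 1;
        std::printf("n=%2d  full grid %12llu   sparse grid %9llu\n",
                    n, (unsigned long long)full,
                    (unsigned long long)sparse_points(d, n + d - 1));
    }
    return 0;
}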

[42] G. W. Zumbusch. Parallel adaptively refined sparse grids. In E. Dick, K. Riemslagh, and J. Vierendeels, editors, Multigrid Methods VI, volume 14 of Lecture Notes in Computational Science and Engineering, pages 285-292. Springer, Berlin, Germany, 2000. (Proceedings EMG 6). [ bib | .ps.gz | .pdf ]
A parallel version of a finite difference discretization of PDEs on sparse grids is proposed. Sparse grids or hyperbolic crosspoints can be used for the efficient representation of solutions of a boundary value problem, especially in high dimensions, because the number of grid points depends only weakly on the dimension. So far only the `combination' technique for regular sparse grids was available on parallel computers. However, the new approach allows for arbitrary, adaptively refined sparse grids. The efficient parallelisation is based on a dynamic load-balancing approach with space-filling curves.

[43] A. Caglar, M. Griebel, M. A. Schweitzer, and G. Zumbusch. Dynamic load-balancing of hierarchical tree algorithms on a cluster of multiprocessor PCs and on the Cray T3E. In H. W. Meuer, editor, Proceedings 14th Supercomputer Conference, Mannheim, ISBN 3-932178-08-4, Mannheim, Germany, 1999. Mateo. SuParCup '99 Award Winning Paper, also as SFB 256 report 27. [ bib | .ps.gz | .pdf ]
The solution of many problems in science and engineering is based on computational kernels for the numerical treatment of partial differential equations (PDEs) or N-body problems. Traditional solution methods however reduce these to linear algebra or brute force algorithms on structured data sets. Larger and larger simulations require smarter algorithms to be tractable. Hierarchical tree algorithms represent such a class, both for PDEs and for N-body problems. However, their efficient parallelization is not straightforward. Some difficulties can be removed, if one can provide a fast dynamic load-balancing scheme to cope with the tree variations of the unstructured data sets. In this paper we propose a very cheap yet very efficient load-balancing scheme for tree algorithms based on space-filling curves. Furthermore we present the Parnass2 cluster, on which such parallel codes perform extremely well. The cluster consists of SMP PCs and a Myrinet network at Gigabit/s speed configured with full bisection bandwidth. It turns out that it not only has the obvious price/performance advantage, but also an absolute performance which is comparable to the latest commercial Cray T3E.

[44] M. Griebel and G. W. Zumbusch. Parallel multigrid in an adaptive PDE solver based on hashing and space-filling curves. Parallel Computing, 25:827-843, 1999. [ bib | .ps.gz | .pdf ]
Partial differential equations can be solved efficiently by adaptive multigrid methods on a parallel computer. We report on the concept of hash-table storage techniques to set up such a program. The code requires substantially less memory than implementations based on tree-type data structures and is easier to program in the sequential case. The parallelization takes place by a space-filling curve domain decomposition intimately connected to the hash table. The new data structure simplifies the parallelization of the code substantially and introduces a cheap way to solve the load balancing and mapping problem. We report on the main features of the method and give the results of numerical experiments with the new parallel solver on a cluster of 64 Pentium II/400MHz connected by a Myrinet in a fat tree topology.
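
The hash-table idea replaces pointer-based trees by addressing each grid node directly through a key built from its refinement level and integer coordinates. The key layout and hash below are illustrative placeholders, not those of the code described, which also couples the keys to the space-filling curve used for partitioning.

#include <cstdint>
#include <unordered_map>

// Address adaptive grid nodes directly by (level, i, j) instead of walking a
// tree: pack the coordinates into one integer key and store node data in a
// hash table. Sketch only.
struct NodeData { double value = 0.0; double rhs = 0.0; };

using Key = std::uint64_t;

inline Key make_key(std::uint32_t level, std::uint32_t i, std::uint32_t j)
{
    return (Key(level) << 58) | (Key(i & 0x1FFFFFFFu) << 29) | Key(j & 0x1FFFFFFFu);
}

struct HashedGrid {
    std::unordered_map<Key, NodeData> nodes;

    NodeData& at(std::uint32_t level, std::uint32_t i, std::uint32_t j) {
        return nodes[make_key(level, i, j)];    // creates the node if absent
    }
    bool exists(std::uint32_t level, std::uint32_t i, std::uint32_t j) const {
        return nodes.count(make_key(level, i, j)) != 0;
    }
};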

[45] M. Griebel and G. W. Zumbusch. Adaptive sparse grids for hyperbolic conservation laws. In M. Fey and R. Jeltsch, editors, Hyperbolic Problems: Theory, Numerics, Applications. 7th International Conference in Zürich, February 1998, volume 1 of International Series of Numerical Mathematics 129, pages 411-422, Basel, Switzerland, 1999. Birkhäuser. [ bib | .ps.gz | .pdf ]
We report on numerical experiments using adaptive sparse grid discretization techniques for the numerical solution of scalar hyperbolic conservation laws. Sparse grids are an efficient approximation method for functions. Compared to regular, uniform grids of mesh parameter h, which contain h^{-d} points in d dimensions, sparse grids require only h^{-1}|log h|^{d-1} points due to a truncated, tensor-product multi-scale basis representation.
For the treatment of conservation laws two different approaches are taken: First an explicit time-stepping scheme based on central differences is introduced. Sparse grids provide the representation of the solution at each time step and reduce the number of unknowns. Further reductions can be achieved with adaptive grid refinement and coarsening in space. Second, an upwind type sparse grid discretization in d+1 dimensional space-time is constructed. The problem is discretized both in space and in time, storing the solution at all time steps at once, which would be too expensive with regular grids. In order to deal with local features of the solution, adaptivity in space-time is employed. This leads to local grid refinement and local time-steps in a natural way.

[46] M. A. Schweitzer, G. W. Zumbusch, and M. Griebel. Parnass2: A cluster of dual-processor PCs. In W. Rehm and T. Ungerer, editors, Proceedings of the 2nd Workshop Cluster-Computing, Karlsruhe, number CSR-99-02 in Chemnitzer Informatik Berichte, pages 45-54, Chemnitz, Germany, 1999. TU Chemnitz. [ bib | .ps.gz | .pdf ]
We report on a cluster with 96 CPUs (48 Dual-Processor PCs running Linux 2.1.127) at our department, called Parnass2. The computing nodes are connected by a Myrinet, a Gigabit per second switched LAN, and additionally by a Fast-Ethernet. A comparison of different message passing (MPI) libraries for the Myrinet is given. Furthermore, we present the performance results of Parnass2 for some of the parallel codes developed at our department, namely a sparse grid code for PDEs, a particle code for molecular dynamics, a finite difference code for the Navier-Stokes equation, and a parallel adaptive finite element / finite difference multi-grid code. We compare these results with the performance of these codes running on a Cray T3E-1200, a Cray T3E-600, a SGI Origin 200/2000 and Parnass [?], a cluster of SGI O2 workstations.

[47] G. Zumbusch. Dynamic loadbalancing in a lightweight adaptive parallel multigrid PDE solver. In B. Hendrickson, K. Yelick, C. Bischof, I. Duff, A. Edelman, G. Geist, M. Heath, M. Heroux, C. Koelbel, R. Schrieber, R. Sinovec, and M. Wheeler, editors, Proceedings of 9th SIAM Conference on Parallel Processing for Scientific Computing (PP 99), San Antonio, Texas, ISBN 0-89871-435-4, page 10, Philadelphia, PA, 1999. SIAM. [ bib | .ps.gz | .pdf ]
A parallel version of an adaptive multigrid solver for partial differential equations is considered. The main emphasis is put on the load balancing algorithm to distribute the adaptive grids at runtime. The background and some applications of space-filling curves are discussed, which are later used as the basic principle of the load-balancing heuristic. A tight integration of space-filling curves as a memory addressing scheme into the numerical algorithm is proposed. Some experiments on a cluster of PCs demonstrate the parallel efficiency and scalability of the approach.

[48] G. W. Zumbusch. A parallel adaptive multigrid method. In W. Hackbusch and S. Sauter, editors, Proceedings of the 15th GAMM-Seminar Kiel on Numerical Techniques for Composite Materials, Notes on Numerical Fluid Mechanics, Wiesbaden, Germany, 1999. Vieweg. submitted. [ bib | .ps.gz | .pdf ]
A parallel version of an adaptive multigrid solver for elliptic partial differential equations is described. It operates on a finite difference discretization on quad-tree and oct-tree meshes, which are obtained by adaptive mesh refinement. A fast parallel load balancing strategy for the parallel multigrid equation solver is proposed that is defined by a space-filling Hilbert curve and is applicable to arbitrary shaped domains. Some numerical experiments demonstrate the parallel efficiency and scalability of the approach.

[49] A. M. Bruaset, H. P. Langtangen, and G. W. Zumbusch. Domain decomposition and multilevel methods in Diffpack. In P. E. Bjørstad, M. S. Espedal, and D. E. Keyes, editors, Proceedings of Domain Decomposition Methods 9, DD9, pages 655-662, Bergen, Norway, 1998. Domain Decomposition Press. also as report STF42 F96017, Sintef Applied Mathematics, Oslo, 1996. [ bib | .ps.gz | .pdf ]
... Domain decomposition and multilevel methods contain a variety of more standard numerical building blocks (linear solvers, matrix assembly, interpolation of fields etc.). Successful software for complicated applications must offer the user a flexible run-time combination of all these different components. The purpose of the present paper is to describe how one can achieve such flexible software. In particular, we present a unified framework for domain decomposition and multilevel methods, and show how this framework can be efficiently implemented in existing software packages for PDEs.
The unified framework about to be presented is in part well known from the analysis of overlapping and non-overlapping methods [M. Dryja O.B. Widlund 1990 ], as well as from theory for overlapping and multilevel schemes [J. Xu 1992]. In this context, the goal of this paper is to extend the known framework to cover even more methods in common use, especially some Schur complement and nonlinear schemes. We will formulate the framework in a novel way that encourages systematic implementation of a wide class of domain decomposition and multilevel methods. Finally, we report on the experiences gathered from a particular implementation in the Diffpack software.

[50] M. Griebel and G. W. Zumbusch. Hash-storage techniques for adaptive multilevel solvers and their domain decomposition parallelization. In J. Mandel, C. Farhat, and X.-C. Cai, editors, Proceedings of Domain Decomposition Methods 10, DD10 (1997), number 218 in Contemporary Mathematics, pages 271-278, Providence, Rhode Island, 1998. AMS. [ bib | .ps.gz | .pdf ]
Partial differential equations can be solved efficiently by adaptive multigrid methods on a parallel computer. We report on the concepts of hash-table storage techniques and space-filling curves to set up such a code. The hash-table storage requires substantially less memory and is easier to code than the tree data structures used in traditional adaptive multigrid codes, already in the sequential case. The parallelization takes place by a domain decomposition via space-filling curves, which are intimately connected to the hash table. The new data structure simplifies the parallel version of the code substantially and introduces a cheap way to solve the load balancing and mapping problem....

[51] M. Griebel and G. W. Zumbusch. Parallel multigrid in an adaptive PDE solver based on hashing. In E. D'Hollander, G.R. Joubert, F.J. Peters, and U. Trottenberg, editors, Parallel Computing: Fundamentals, Applications and New Directions, number 12 in Advances in Parallel Computing, pages 589-600, Amsterdam, The Netherlands, 1998. Elsevier. Proceedings of ParCo 97, Bonn, Germany. [ bib | .ps.gz | .pdf ]
Partial differential equations can be solved efficiently by adaptive multigrid methods on a parallel computer. We report on the concept of hash-table storage techniques to set up such a code. The code requires substantially less memory and is easier to code in the sequential case. The parallelization takes place by a space-filling curve domain decomposition intimately connected to the hash table. The new data structure simplifies the parallel version of the code substantially and introduces a cheap way to solve the load balancing and mapping problem.

[52] T. Schiekofer and G. W. Zumbusch. Software concepts of a sparse grid finite difference code. In W. Hackbusch and G. Wittum, editors, Proceedings of the 14th GAMM-Seminar Kiel on Concepts of Numerical Software, Notes on Numerical Fluid Mechanics, page 11, Wiesbaden, Germany, 1998. Vieweg. submitted. [ bib | .ps.gz | .pdf ]
Sparse grids provide an efficient representation of discrete solutions of PDEs and are mainly based on specific tensor products of one-dimensional hierarchical basis functions. They easily allow for adaptive refinement and compression. We present special finite difference operators on sparse grids that possess nearly the same properties as full grid operators. Using this approach, partial differential equations of second order can be discretized straightforwardly. We report on an adaptive finite difference research code implementing this on sparse grids. It is structured in an object oriented way. It is based on hash storage techniques as a new data structure for sparse grids. Due to the direct access to arbitrary data, traditional tree-like structures can be avoided. The above techniques are employed for the solution of parabolic problems. We present a simple space-time discretization. Furthermore a time-stepping procedure for the solution of the Navier-Stokes equations in 3D is presented. Here we discretize by a projection method and obtain Poisson problems and convection-diffusion problems.

[53] M. Griebel and G. W. Zumbusch. Parnass: Porting gigabit-LAN components to a workstation cluster. In W. Rehm, editor, Proceedings of the 1st Workshop Cluster-Computing, number CSR-97-05 in Chemnitzer Informatik Berichte, pages 101-124, Chemnitz, Germany, 1997. TU Chemnitz. also as Techn. Report No 19, SFB 256, Univ. Bonn. [ bib | .ps.gz | .pdf ]
We will report on a cluster of workstations at our department, called Parnass. It is based on different types of MIPS processor workstations and servers, connected by a Myrinet, a Gigabit per second switched LAN, and additionally a Fast Ethernet. We have ported some low level message passing libraries as well as MPI to the Myrinet. A comparison of the performance of various communication libraries on different networks will be presented.

[54] M. Korzen, R. Schriever, K.-U. Ziener, O. Paetsch, and G. W. Zumbusch. Real-time 3-d visualization of surface temperature fields measured by thermocouples on steel structures in fire engineering. In J. Ziebs, J. Bressers, H. Frenz, D. R. Hayhurst, H. Klingelhöffer, and S. Forest, editors, Proceedings of International Symposium Local Strain and Temperature Measurements in Non-Uniform Fields at Elevated Temperatures, pages 253-262, Cambridge, UK, 1996. Woodhead Pub. [ bib | .ps.gz | .pdf ]
The aim of this paper is to present some advanced techniques for monitoring thermocouples in the field of fire engineering. In a fire test, structural elements like columns, beams or slabs, which are insulated by a fire protection material, are subjected to mechanical as well as thermal loadings. Whereas the mechanical loading is constant, the mean temperature in the furnace varies - due to oil or gas burners - nearly monotonically as a function of time within 90 minutes between room temperature and 1000 °C. New technical standards as well as research purposes require the monitoring of 30 to 60 thermocouples and more. Although versatile computer based data acquisition systems, including the necessary signal conditioning front-ends, exist for handling such an amount of data at any required rate, there is a lack in representing these data during the test in their geometrical context, i.e. as a property of the steel surface. The method proposed by the authors uses some recent developments in computer graphics and numerical mathematics. By this method the monitoring of the thermocouples is understood as a representation of a time-dependent 1-dimensional field, which is based on discrete measured values on a curved surface in 3-D space. For this purpose, CAD and data visualization tools, originally designed for other applications, are being tested. In practice, a geometry file has to be created before the fire test for the structural element under consideration, including the information on the position of the thermocouples. This file is used as an appropriate triangulation of the surface of the specimen. The corresponding grid together with the actual temperature readings is the basis for the real-time visualization of the temperature field by continuous colors or iso-lines.

[55] G. W. Zumbusch. Multigrid methods in Diffpack. Technical Report STF42 F96016, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report gives an introduction to the multigrid iterative solvers in Diffpack. It is meant as a tutorial for the use of iterative solvers, preconditioners and nonlinear solvers based on multigrid methods. The first steps towards these efficient equation solvers are guided by a couple of examples and exercises. Since multigrid is a recipe to construct solution algorithms rather than a black-box algorithm itself, there is lots of freedom for the user to tailor the actual solver. Reflecting this fact, there are lots of possibilities to use the appropriate classes in Diffpack. Hence much advice is needed not only to get started, but also to use the methods efficiently. The exercises are meant to give some experience needed for applications and questions not covered in this introductory report.

[56] G. W. Zumbusch. Overlapping domain decomposition methods in Diffpack. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report gives an introduction to the overlapping domain decomposition solvers of Schwarz type in Diffpack. It is meant as a tutorial for the use of iterative solvers, preconditioners and nonlinear solvers based on overlapping Schwarz methods for partial differential equations. Additive Schwarz methods serve as a standard method for solving equation systems on parallel computers. They are also useful for computations on complicated domains constructed from simple domains where efficient equation solvers are available. We provide an introduction to the implementation and use of such methods in Diffpack. The first steps are guided by a couple of examples and exercises. We also want to refer to an accompanying tutorial on multigrid methods in Diffpack, whose methods and codes are closely related.

[57] G. W. Zumbusch. Schur complement domain decomposition methods in Diffpack. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report gives an introduction to the Schur complement domain decomposition solvers in Diffpack. It is meant as a tutorial for the use of iterative solution methods of equation systems arising in the discretization of partial differential equations. Schur complement iterative solvers are discussed, without and with preconditioners. They are also referred to as iterative sub-structuring methods or non-overlapping domain decomposition methods. Domain decomposition methods are well suited and efficient equation solvers on parallel computers. Schur complement methods are also advantageous if there are abrupt changes in the coefficients of the differential operator due to abrupt changes in material properties. We provide an introduction to the implementation and use of such methods in Diffpack. We cover the basic Schur complement method along with preconditioners of eigen-decomposition, BPS, wire-basket and Neumann-Neumann type (with coarse grid). The first steps are guided by a couple of examples and exercises. We also want to refer to the related tutorials on overlapping domain decomposition [?] and on multigrid [?] methods in Diffpack.

[58] G. W. Zumbusch. Multigrid applied to different partial differential operators. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report is a continuation of an introductory report on the multigrid iterative solvers in Diffpack. We consider the solution of systems of equations as arising in linear elasticity, non-symmetric equations as in convection-diffusion problems, anisotropic operators, and badly conditioned equations as for jumping-coefficient problems. In the introductory report only the Laplacian and smooth coefficients were treated. The first steps are guided by a couple of examples and exercises.

[59] G. W. Zumbusch. Multigrid on different grids. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report is a continuation of an introductory report on the multigrid iterative solvers in Diffpack. We consider the solution of equation systems arising in the finite element discretization of partial differential equations on different grids. In the introductory report only uniform partitions of the unit square and unit cube were treated. Now we also consider multigrid for mapped elements, grids generated by the meshing of super elements, and unstructured (and non-nested) grids. The first steps are guided by a couple of examples and exercises.

[60] G. W. Zumbusch. Multigrid for different finite difference equations. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report is a continuation of an introductory report on the multigrid iterative solvers for finite differences in Diffpack. We consider the solution of partial differential equations discretized by finite differences. We consider varying coefficient and anisotropic operators and a variety of strategies for the convection-diffusion equation and the biharmonic equation. In the introductory report only the Laplacian was treated. We also discuss different multigrid restriction and prolongation operators arising in some special multigrid versions. The first steps are guided by a couple of examples.

[61] G. W. Zumbusch. Multigrid methods for finite differences. Technical report, Sintef Applied Mathematics, Oslo, Norway, 1996. [ bib | .ps.gz | .pdf ]
The report serves as an alternative introductory report on the multigrid iterative solvers in Diffpack using finite differences instead of finite elements covered previously. We consider the solution of elliptic partial differential equations on different domains. We solve the resulting linear equation systems with a multigrid iteration or a Krylov iteration with a multigrid preconditioner. The multigrid restriction and prolongation are also implemented using finite “difference” type stencils. The first steps are guided by a couple of examples and exercises.

[62] G. W. Zumbusch. Simultanous h-p Adaptation in Multilevel Finite Elements. ISBN 3-8265-1136-0. Shaker, Aachen, Germany, 1996. Informatik, FU Berlin, 1995. [ bib | .ps.gz | .pdf ]
An important tool in engineering is the finite element method.... The combination of both methods, the h-p-version, continuously supplies the pre-asymptotically exponentially convergent p-version with properly adapted grids. Hence it also achieves the superior exponential convergence asymptotically, instead of the algebraic convergence of its ingredients, the h-version and the p-version. Although the first theoretical results establishing these convergence rates are quite classic, the number of codes using the h-p-version of finite elements is still rather limited. Reasons for this are the sheer implementational complexity and its details, together with the reputed low precision requirements of engineers. The major reason, however, is the lack of a robust (self-)adaptive control delivering the desired exponential convergence. ...
In this thesis we present some steps towards an efficient implementation of the theoretically known exponential convergence. As it turns out, an efficient implementation requires additional theoretical considerations, which play a major role here as well. This includes both the fully automatic h-p-version and, as a subset, the p-version on suitable grids. We present some details of our approach to implementing an adaptive h-p-version based on the adaptive multilevel h-version code Kaskade. This software package uses unstructured grids of triangles in two dimensions and tetrahedra in three dimensions.
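For orientation only (standard a priori estimates from the h-p literature, not results specific to this thesis): for solutions with point singularities the h- and p-versions converge only algebraically in the number of degrees of freedom N, whereas the h-p-version on properly graded meshes converges exponentially,

    \[
    \|u - u_{h}\|_{E} \le C\,N^{-\alpha}
    \qquad\text{vs.}\qquad
    \|u - u_{hp}\|_{E} \le C\,e^{-b\,N^{1/3}}\quad\text{(in two dimensions)},
    \]

with α determined by the polynomial degree and the strength of the singularity, and with the exponent 1/3 replaced by 1/5 in three dimensions.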

[63] G. W. Zumbusch. Symmetric hierarchical polynomials and the adaptive h-p-version. Houston Journal of Mathematics, pages 529-540, 1996. Proceedings of the Third International Conference on Spectral and High Order Methods, ICOSAHOM'95, also as report SC-95-18 ZIB, Berlin. [ bib | .ps.gz | .pdf ]
The h-p-version of finite elements delivers sub-exponential convergence in the energy norm. A step towards a fully adaptive implementation is taken in the context of unstructured meshes of simplices with variable order p in space. Both assumptions lead to desirable properties of the shape functions, such as symmetry, p-hierarchy and simple coupling of elements.
In a first step it is demonstrated that for standard polynomial vector spaces on simplices not all of these features can be obtained simultaneously. However, this is possible if these spaces are slightly extended or reduced. Thus a new class of polynomial shape functions is derived, which is especially well suited for three-dimensional tetrahedra.
The construction is completed by directly minimizing the condition numbers of the arising preconditioned local finite element matrices. The preconditioner is based on two-step domain decomposition techniques using a multigrid solver for the global linear problem p=1 and direct solvers for local higher order problems.
Some numerical results concerning an adaptive (feedback) version of h-p finite elements are presented.
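As a one-dimensional point of reference (the familiar prototype, not the simplex construction of the paper), hierarchical shape functions on [-1,1] are commonly built from integrated Legendre polynomials,

    \[
    \phi_1(\xi) = \tfrac{1-\xi}{2},\qquad
    \phi_2(\xi) = \tfrac{1+\xi}{2},\qquad
    \phi_k(\xi) = \frac{L_{k-1}(\xi) - L_{k-3}(\xi)}{\sqrt{2\,(2k-3)}},\quad k \ge 3,
    \]

where L_n denotes the Legendre polynomial of degree n; the functions with k >= 3 vanish at the endpoints and are alternately symmetric and antisymmetric, which gives the p-hierarchy and the simple inter-element coupling mentioned above.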

[64] Å. Ødegård and G. W. Zumbusch. Graphview - en java og cgi-pakke for 3d grafikk. Technical Report 9, Sintef Applied Mathematics, Oslo, Norway, 1996. STIM working notes. [ bib | .ps.gz | .pdf ]
Nowadays it is important to have a presence on the Internet. One of the latest innovations in this respect is the possibility of embedding applications in Web pages, which allows interaction between the Web page and its audience. This has previously been possible by means of CGI, the Common Gateway Interface. With this technology one can run scripts, programs, etc. on the server, that is, “at home” relative to whoever created the Web page. With such tools one can retrieve information, generate data sets with the help of programs, and so on. The most recent innovation, however, is Java. This is a programming language specially designed for dynamic interaction with the audience of a Web page. In contrast to CGI, a Java program runs on the client, that is, locally on the machine of whoever is viewing the Web page. One is therefore not limited by data having to be sent over the network. This makes it possible, for example, to have mouse interaction on Web pages. The drawback is that the code cannot be optimized for a particular architecture, since it has to run on many different architectures. The executed code is therefore “half-compiled”, so that it must finally be interpreted by the Web browser on the client. The speed is consequently not very high, but definitely usable. ...

[65] Ch. Schütte, M. Dinand, G. W. Zumbusch, and R. Brinkmann. Dynamics of Erbium-doped waveguide lasers: Modelling, reliable simulation, and comparison with experiments. Technical Report SC-95-19, Konrad-Zuse-Zentrum, Berlin, Germany, 1995. [ bib | .ps.gz | .pdf ]
A theoretical investigation of the dynamic properties of integrated optical Erbium-doped waveguide lasers is presented. It includes the construction of a physical model and of numerical techniques which allow reliable simulations of the dynamical behaviour of the laser signal, depending on essential parameters of the laser device and on its external, time-dependent pump radiation. To this end, a physical theory is developed which describes the propagation of light and its interaction with the active substrate in the laser cavity. This is realized in two steps. First, a fundamental model based on Maxwell's equations and on rate equations for the transitions in the active medium is constructed. Since this turns out to prohibit reliable simulations, it is, in a second step, reformulated via averaging in time and space, which suppresses the fluctuations on the fastest time scales but represents them correctly. For this reduced model, reliable and efficient simulation techniques using adaptive control schemes are designed and implemented. We apply the linearly implicit Euler discretization with extrapolation in time and a multilevel quadrature scheme in space. Finally, the model is validated by comparison with experimental observations in four cases of technological relevance.
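The time discretization mentioned at the end can be sketched generically: below is a minimal Python illustration of one linearly implicit Euler step combined with a single Richardson extrapolation on a toy problem. The function names, the test equation and the error estimate are assumptions for illustration, not the laser model or the adaptive code of the report.

    # Sketch of a linearly implicit Euler step with Richardson extrapolation
    # for a stiff autonomous system y' = f(y); generic illustration only.
    import numpy as np

    def lin_implicit_euler_step(f, jac, y, h):
        """One linearly implicit Euler step: solve (I - h J) dy = h f(y)."""
        n = len(y)
        J = jac(y)
        dy = np.linalg.solve(np.eye(n) - h * J, h * f(y))
        return y + dy

    def extrapolated_step(f, jac, y, h):
        """Combine one step of size h with two steps of size h/2 and
        extrapolate; the difference serves as an error estimate that an
        adaptive code would feed into step-size control."""
        y1 = lin_implicit_euler_step(f, jac, y, h)
        y_half = lin_implicit_euler_step(f, jac, y, 0.5 * h)
        y2 = lin_implicit_euler_step(f, jac, y_half, 0.5 * h)
        y_extrap = 2.0 * y2 - y1          # Richardson extrapolation
        err = np.linalg.norm(y2 - y1)     # crude local error estimate
        return y_extrap, err

    # usage on a simple stiff scalar test problem y' = -50 y + 10
    f = lambda y: np.array([-50.0 * y[0] + 10.0])
    jac = lambda y: np.array([[-50.0]])
    y, h = np.array([0.0]), 0.05
    y, err = extrapolated_step(f, jac, y, h)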

[66] G. W. Zumbusch. Adaptive h-p approximation procedures, graded meshes and anisotropic refinement for numerical quadrature. In F. Brezzi, J. Periaux, R. Glowinski, R. Rannacher, and Yu. Kuznetsov, editors, Proceedings of the First European Conference on Numerical Mathematics and Advanced Applications, ENUMATH 95, page 12, 1995. Accepted; also as report SC-95-24 ZIB, Berlin. [ bib | .ps.gz | .pdf ]
A set of adaptive algorithms for quadrature on multi-dimensional polyhedral domains is presented. Several kinds of refinement are discussed, covering local improvement of the quadrature order and splitting of the domain into sub-domains, resulting in isotropic, graded or anisotropic grids. The algorithms are purely local heuristics using no a priori knowledge or tuning parameters. This approach is motivated by optimal approximation results from finite element theory. Numerical experiments show the optimality of purely local greedy-like algorithms for singularity-type functions typically occurring in finite element computations.
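A caricature of such a purely local greedy refinement loop in one dimension (splitting only; the h-p decision between splitting an interval and raising its order is omitted, and all names are illustrative) might look as follows:

    # Greedy adaptive quadrature sketch in 1D: keep a priority queue of
    # subintervals ordered by a local error estimate and always refine the
    # worst one; a much-simplified illustration of the local heuristics above.
    import heapq
    import math

    def local_estimate(f, a, b):
        """Midpoint rule vs. two midpoint panels: value and crude error estimate."""
        m = 0.5 * (a + b)
        coarse = (b - a) * f(m)
        fine = 0.5 * (b - a) * (f(0.5 * (a + m)) + f(0.5 * (m + b)))
        return fine, abs(fine - coarse)

    def adaptive_quad(f, a, b, n_refinements=200):
        value, err = local_estimate(f, a, b)
        heap = [(-err, a, b, value)]      # max-heap on the error estimate
        total = value
        for _ in range(n_refinements):
            neg_err, a0, b0, val = heapq.heappop(heap)   # worst interval
            m = 0.5 * (a0 + b0)
            total -= val
            for lo, hi in ((a0, m), (m, b0)):            # split and re-estimate
                v, e = local_estimate(f, lo, hi)
                total += v
                heapq.heappush(heap, (-e, lo, hi, v))
        return total

    # usage: a singularity-type integrand as it occurs in finite element work
    print(adaptive_quad(lambda x: math.sqrt(x), 0.0, 1.0))   # exact value 2/3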

[67] G. W. Zumbusch. Simultanous h-p Adaptation in Multilevel Finite Elements. Doktorarbeit, Fachbereich Mathematik und Informatik, FU Berlin, 1995. [ bib | .ps.gz | .pdf ]
An important tool in engineering is the finite element method.... The combination of both methods, the h-p-version, continuously supplies the pre-asymptotically exponentially convergent p-version with properly adapted grids. Hence it also achieves the superior exponential convergence asymptotically, instead of the algebraic convergence of its ingredients, the h-version and the p-version. Although the first theoretical results establishing these convergence rates are quite classic, the number of codes using the h-p-version of finite elements is still rather limited. Reasons for this are the sheer implementational complexity and its details, together with the reputed low precision requirements of engineers. The major reason, however, is the lack of a robust (self-)adaptive control delivering the desired exponential convergence. ...
In this thesis we present some steps towards an efficient implementation of the theoretically known exponential convergence. As it turns out, an efficient implementation requires additional theoretical considerations, which play a major role here as well. This includes both the fully automatic h-p-version and, as a subset, the p-version on suitable grids. We present some details of our approach to implementing an adaptive h-p-version based on the adaptive multilevel h-version code Kaskade. This software package uses unstructured grids of triangles in two dimensions and tetrahedra in three dimensions.

[68] G. W. Zumbusch. Visualizing functions of the h-p-version of finite elements. Technical Report TR-94-05, Konrad-Zuse-Zentrum, Berlin, Germany, 1994. [ bib | .ps.gz | .pdf ]
Results from finite element calculations are usually visualized by colored surface and contour-line plots, polygonal patches, or simply displaced lines and grids. In computer graphics, however, more advanced techniques such as texture mapping and NURBS are well established, and efficient algorithms and implementations exist. We show that these techniques are not only easy to use, but also form a very natural and efficient approach for the visualization of higher-order finite element solutions, as in the p- and h-p-versions. Texture mapping is also useful for displaying vector-valued data.

[69] G. W. Zumbusch. Symmetric hierarchical polynomials for the h-p-version of finite elements. Technical Report SC-93-32, Konrad-Zuse-Zentrum, Berlin, Germany, 1993. [ bib | .ps.gz | .pdf ]
Adaptive numerical methods using the h-p-version of finite elements require special kinds of shape functions. Desirable properties of them are symmetry, hierarchy and simple coupling. In a first step it is demonstrated that for standard polynomial vector spaces not all of these features can be obtained simultaneously. However, this is possible if these spaces are extended. Thus a new class of polynomial shape functions is derived, which is well-suited for the p- and h-p-version of finite elements on unstructured simplices. The construction is completed by minimizing the condition numbers of the arising finite element matrices. The new shape functions are compared with standard functions widely used in the literature.

[70] G. W. Zumbusch. Adaptive parallele Multilevel-Methoden zur Lösung elliptischer Randwertprobleme. Diplomarbeit, Mathematisches Institut, TU München, Munich, Germany, 1992. [ bib | .ps.gz | .pdf ]
Multilevel methods are currently the most efficient solvers for the large symmetric sparse linear systems of equations that arise from the finite element discretization of self-adjoint elliptic boundary value problems. In the following, the parallelization of a preconditioned conjugate gradient method proposed by Bramble, Pasciak and Xu, embedded in an adaptive finite element program such as Kaskade, is discussed. For efficient load balancing, additional requirements have to be imposed on the triangulations, the finite element spaces and the grid manipulation algorithms. Standard load balancing methods are compared with a new partitioning scheme based on a statistical approach. With a mixed strategy of partitioning grid points and triangles, presented here, an overall method of optimal order can be achieved. The experimental results on several parallel computers show close agreement with a derived cost functional and ideal parallel scalability of the method. Differences to other known multilevel methods are pointed out in the individual sections.
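The preconditioner referred to above can, in one common diagonally scaled form of the Bramble-Pasciak-Xu construction (quoted here only for orientation), be written as

    \[
    C\,r \;=\; \sum_{k=0}^{J}\,\sum_{i}\,
    \frac{(r,\,\phi_i^{(k)})}{a(\phi_i^{(k)},\,\phi_i^{(k)})}\;\phi_i^{(k)},
    \]

where the \(\phi_i^{(k)}\) are the nodal basis functions on refinement level k and \(a(\cdot,\cdot)\) is the bilinear form of the boundary value problem; one application consists of independent contributions from all levels and nodes, which is precisely what makes the method attractive for parallelization.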

[71] G. W. Zumbusch. Adaptive parallele Multilevel-Methoden zur Lösung elliptischer Randwertprobleme. Technical Report 342/19/91 A, SFB 342, TU München, Munich, Germany, 1991. [ bib | .ps.gz | .pdf ]
Multilevel methods are currently the most efficient methods for solving the large linear systems of equations that arise from the discretization of elliptic boundary value problems. Under suitable assumptions, the cost of solving such a system to discretization accuracy with a multigrid method, or with a multilevel preconditioned conjugate gradient (CG) method in the symmetric positive definite case, is proportional to the number of unknowns, or exceeds it only by a logarithmic factor. ...
The execution speed can be increased by adaptive refinement techniques, which reduce the number of unknowns needed. Complete program packages exist for this, such as PLTMG [Bank] and Kaskade [Leinen], [Deuflhard, Leinen, Yserentant], whose grid management, refinement and grid manipulation do not degrade the order of the built-in iterative solver. The order of the solution method can only be lowered further by parallel execution. With very large linear systems in mind, as they arise in particular from boundary value problems in three space dimensions, it is natural to combine both techniques. When load-balancing adaptively, i.e. dynamically, generated structures, however, not all requirements on the finite element spaces and not all grid manipulation algorithms can be carried over from the sequential program. Existing approaches such as [Fox & Otto], [Berger & Bokhari] and [Bastian] lead to methods whose order is higher than that of the iterative solver and do not exploit the multilevel structure of the grids. Approaches to parallelizing standard multigrid methods such as [Briggs, Hart, McCormick & Quinlan] or adaptive multigrid methods such as [Mierendorff] cannot be applied to adaptive methods in this form, although they deliver optimal results for regularly refined grids. ...
In the following we use distributed-memory parallel computers with message-passing communication and shared-memory parallel computers with semaphore synchronization to implement a multilevel preconditioned CG method in such a way that the properties of the sequential program are preserved as far as possible while an efficient parallelization is achieved at the same time.


This file was generated by bibtex2html 1.91.