Automatic Parallelization using Pipelining for Equation-Based Simulation Languages Håkan Lundvall by
Linköping Studies in Science and Technology Thesis No. 1381 Automatic Parallelization using Pipelining for Equation-Based Simulation Languages by Håkan Lundvall Submitted to Linköping Institute of Technology at Linköping University in partial fulfilment of the requirements for the degree of Licentiate of Engineering Department of Computer and Information Science Linköpings universitet SE-581 83 Linköping, Sweden Linköping 2008 Automatic Parallelization using Pipelining for Equation-Based Simulation Languages by Håkan Lundvall September 2008 ISBN 978-91-7393-799-3 Thesis No. 1381 ISSN 0280-7971 ABSTRACT During the most recent decades modern equation-based object-oriented modeling and simulation languages, such as Modelica, have become available. This has made it easier to build complex and more detailed models for use in simulation. To be able to simulate such large and complex systems it is sometimes not enough to rely on the ability of a compiler to optimize the simulation code and reduce the size of the underlying set of equations to speed up the simulation on a single processor. Instead we must look for ways to utilize the increasing number of processing units available in modern computers. However to gain any increased performance from a parallel computer the simulation program must be expressed in a way that exposes the potential parallelism to the computer. Doing this manually is not a simple task and most modelers are not experts in parallel computing. Therefore it is very appealing to let the compiler parallelize the simulation code automatically. This thesis investigates techniques of using automatic translation of models in typical equation based languages, such as Modelica, into parallel simulation code that enable high utilization of available processors in a parallel computer. The two main ideas investigated here are the following: first, to apply parallelization simultaneously to both the system equations and the numerical solver, and secondly. to use software pipelining to further reduce the time processors are kept waiting for the results of other processors. Prototype implementations of the investigated techniques have been developed as a part of the OpenModelica open source compiler for Modelica. The prototype has been used to evaluate the parallelization techniques by measuring the execution time of test models on a few parallel archtectures and to compare the results to sequential code as well as to the results achieved in earlier work. A measured speedup of 6.1 on eight processors on a shared memory machine has been reached. It still remains to evaluate the methods for a wider range of test models and parallel architectures. This work has been supported by the Swedish Foundation for Strategic Research (SSF) in the ECSEL graduate school, by Vinnova in the Safe and Secure Modeling and Simulation Project, and by the Swedish Research Council (VR). Acknowledgements I have learned a great deal during these years working towards this thesis and I have many people to thank for this. First, I would like to express my gratitude towards my supervisor Peter Fritzson for giving me the opportunity to work on this research and for the guidance and encouragement he has provided. I also want to thank my co-supervisors Christoph Kessler and Dag Fritzson for their support. Even though I live two hours drive from the colleagues at PELAB it has always been a pleasant drive knowing that interesting discussion and good friends await when I arrive. Many thanks to all the employees at PELAB for that. And I would especially like to thank Bodil for all the administrative help which has made the life easier for all the rest of us. Those days when I have not drove all the way to Linköping, Sogeti Sverige AB has provided a nice working atmosphere here in Örebro, as well as a computer to work with. My time has been divided evenly between the PhD studies and the work as a consultant at Sogeti and I wish to thank Atlas Copco Rock Drills AB for keeping me busy with assignments through out these years knowing that they can only get half of my attention. Finally I would like to thank my lovely wife Jenny and my wonderful daughters Emma and Lina for always being there and believing in me. Örebro, August 13, 2008 Håkan Lundvall i Table of Contents Part 1 Introduction 5 Chapter 1 Background and Introduction 6 1.1 1.2 1.3 1.4 1.5 1.6 6 7 8 9 9 9 Modeling and Simulation Related Work on Task Scheduling and Clustering Research Problem Contributions Thesis Structure List of Papers Chapter 2 Implementation – The OpenModelica Parallelizing Backend 11 2.1 2.2 2.3 2.3.1 2.3.2 2.3.3 2.3.4 2.3.5 2.3.6 2.4 2.4.1 2.4.2 2.5 11 12 13 14 17 18 18 20 21 21 22 22 23 Background Compiler Phases Modules of the OpenModelica Compiler DAELow TaskGraph TaskGraphExt ModPar/Schedule ModPar/Codegen ModPar/TaskGraph Simulation Runtime MPI-Based Runtime System Pthreads-Based Runtime System Additional Load Balancing and Memory Layout Ideas Bibliography 25 Part II Papers 27 Paper A 1 2 3 4 4.1 4.2 4.3 4.4 5 6 7 8 9 Event Handling in the OpenModelica Compiler and Runtime System 29 Introduction Conversion of a Modelica Model to a Hybrid DAE Simulation of Hybrid Models Based on Solving Hybrid DAEs The Form of the Hybrid DAE System Summary of Notation Continuous-Time Behavior Discrete-Time Behavior The Complete Hybrid DAE Hybrid DAE Solution Algorithm Varying Structure of the Active Part of the Hybrid DAE Finding Consistent Initial Values at Start or Restart Detecting Events during Continuous-time Simulation Crossing Functions 31 32 32 33 33 34 35 36 36 38 39 40 40 ii 10 11 12 13 13.1 13.2 14 15 16 Integrating the Numerical Solver with Event-Handling Code Generation Example Measurements and Evaluation Bouncing Ball Full Wave Rectifier Conclusions Acknowledgements References 42 44 45 47 47 48 50 50 50 Paper B Automatic Parallelization of Object Oriented Models Across Method and System 53 1 2 3 3.1 3.2 3.3 4 5 6 7 8 9 10 11 12 Paper C 1 2 2.1 3 3.1 3.2 3.3 4 5 6 7 8 Background – Introduction to Mathematical Modeling and Modelica The OpenModelica Open Source Implementation Approaches to Integrate Parallelism and Mathematical Models Automatic Parallelization of Mathematical Models Coarse-Grained Explicit Parallelization Using Computational Components Explicit Parallel Programming Combining Parallelization at Several Levels Pipelining the Task Graph Sorting Equations for Short Access Distance Scheduling Measurements Conclusion Future work Acknowledgements References Automatic Parallelization of Models using Pipeline Extraction from Combined RHS and Inlined Solvers Introduction Background on Mathematical Modeling and Modelica The OpenModelica Open Source Environment Approaches to Exploit Parallelism in Mathematical Models Automatic Fine-Grained Parallelization of Mathematical Models Coarse-Grained Explicit Parallelization Using Computational Components Explicit Parallel Programming Combining Parallelization from Several Levels of Abstraction Pipelining the Task Graph Sorting Equations for Short Access Distance Scheduling Measurements 55 57 57 58 59 59 59 62 62 63 64 65 65 66 66 69 71 72 74 75 75 75 76 76 77 77 79 81 iii 9 10 11 12 Conclusions Future work Acknowledgements References 85 85 85 86 Part 1 Introduction Chapter 1 Background and Introduction In this chapter we first give a short background and introduction to the research area investigated. We also state the research problem and list the contributions of this work. At the end of the chapter we present the structure of this thesis and list the papers presenting the results of the research. 1.1 Modeling and Simulation In many areas modeling and simulation becomes more and more important. In product development, for example, modeling and simulation is used to allow for shorter development time and more robust systems. Instead of embarking on a time consuming and expensive prototype construction, a model of the system is built and experiments are performed on this model. The model can be a physical model where some details of the real system has been left out, but usually the model is a mathematical one stored in a computer and the experiments carried out on the model are computer simulations. Running the simulations on a model instead of building the system can help us to answer about the system which would not otherwise be possible due to reasons such as: • • • • The experiments would be too expensive to perform on a real system. The investigated variable may not be accessible in the real system. Performing some experiments on the real system may be too dangerous. It may take a too long time to perform the experiment on the real system. In order to perform a simulation of a model on a computer we must have an executable program that performs all the calculations needed. When computer simulations first saw the light of day these programs were hand written in relatively low level languages like C or Fortran. Although hand coded simulation programs are still used and developed, more and more models are written in high level equation-based modeling languages such as Modelica, gPROMS, and VHDLAMS. One of the great benefits of these equation-based languages is that the mathematical models behind the simulations are usually given in equations and a lot of manual work of transforming the equations into assignment statements can be Related Work on Task Scheduling and Clustering 7 avoided. The same submodel can also be used in several different contexts where the dependent variables, i.e., the variables to solve for, are not necessarily the same. This greatly reduces development time and improves reusability. The information about which variables are dependent variables and which are independent variables in the equations is called the causality of the system. As the models grow larger and more accurate simulation results are desired, more computational power is needed. Although computers have become faster, recently most of the increased computational power of new computers can only be utilized with extensive usage of parallelism. This puts demands on the modeling languages used to allow the simulation programs to utilize parallelism either by introducing language constructs to allow the modeler to express parallelism explicitly in the models, but even better by automatically parallelizing the same models previously used in simulations on single CPU machines. 1.2 Related Work on Task Scheduling and Clustering When a mathematical model is translated into executable simulation code a task graph is generated which describes all dependencies between the tasks which puts restrictions on the order in which the tasks can be executed. Therefore it is important to have efficient algorithms that can schedule the tasks on a parallel computer. Much research has been done in this area. In this section we summarize some of the previous work on task scheduling and clustering. A classification of scheduling algorithms can be found in , which classifies the algorithms in a hierarchical manner. At the top level the algorithms are divided into local and global algorithms, where the local algorithms schedule the tasks on one processor whereas the global algorithms consider the case of multiple processors. The global algorithms are further divided into static and dynamic algorithms. In this work we explore static scheduling which means the scheduling takes place at compile time. Static scheduling can be done optimally or sub-optimally and sub-optimal scheduling is further divided into heuristic and approximate scheduling. Both the optimal and the approximate sub-optimal scheduling are divided into four different kinds: enumerative, graph-theory, mathematical programming, and queuing theory. The work in this thesis falls under the approximate scheduling category. There exist algorithms for automatically scheduling entire taskgraphs on a fixed number of processors, such as the Modified Critical Path (MCP) scheduling algorithm  and the Earliest Ready Task (ERT) algorithm . The problem with scheduling algorithms which solves the entire scheduling problem, including assigning each task to a processor, is the computational complexity of such algorithms. Therefore, a clustering algorithm is used to make the job easier for the scheduler. A cluster is a set of tasks that run on the same processor. After the clustering face, a smaller number of clusters can be assigned to the processors and the scheduler still need to determine the execution order within the processor, i.e., local scheduling. 8 Background and Introduction Existing clustering algorithms include the Internalization algorithm  and the Dominant Sequence Clustering (DSC) algorithm . The Internalization algorithm works by traversing all edges of the task graph and check whether assigning the two tasks to the same processor would lead to an increased total execution time. If this is not the case the edge is internalized and the two tasks are put in the same cluster. 1.3 Research Problem In this section we discuss the research problem of this work. In the research leading to this thesis the aim has been to investigate if other techniques to automatically translate models stated in equation-based languages into parallel code can lead to more efficient code compared to previously tested approaches. A second objective of this work has been to reduce the complexity of the translation algorithm in order to speed up the translation process. The research methodology used in this work is the traditional system oriented computer science research method. In order to validate our hypotheses, a prototype implementation is built. By running simulations of test models on parallel computers we measure the amount of time needed to complete the simulations. We can compare these measurements to test runs on a single processor and thereby calculate the gained speed-up. In order to compare the results to earlier work the test model used is chosen such that test data are available from the earlier work to which we compare our results. This work shares essentially the same objectives as the work presented in  by Peter Aronsson, but we continue a bit further and develop some new methods. Parallelizing the simulation of a mathematical model can be done at several levels, see Paper B, page 59. The calculation of the simulation results involves a numerical solver which iterates over the simulation problem in order to compute the result. Parallelization can be done at the numerical method level by using a parallel version of the solver or at the system level by using a sequential solver where each call to the right hand side of the system of equations is parallelized. One hypothesis explored in this work is that more speedup can be gained if parallelization is done at both of these levels at the same time. The methods developed in Aronssons thesis use a graph rewriting system to automatically merge tasks into larger units to increase the granularity of the task graph. This is done on the global full system task graph. A second hypothesis explored in this work is that compilation time can be reduced by assigning the tasks to processors early, guided by information about locality present in the model, to avoid expensive work on a large task graph, and instead later merge the tasks locally in each processor, to increase granularity. Contributions 9 1.4 Contributions The following are the main contributions of this work: • • • Contributions to the OpenModelica compiler (specifically the event handling mechanism) which is used in several research projects. This is described in Paper A, pages 29-50. A new variant of topological sorting of equations to allow for simple and fast scheduling with little communication. This is described in Paper B, pages 53-66. Task clustering after processor assignment to enable pipelining of simulations steps without unnecessary waiting times. This is described in Paper C, pages 69-85. 1.5 Thesis Structure This thesis has the following structure. Part I is divided into two chapters where the first gives an introduction including the research objective, some background on the Modelica Language and a summary of the contributions of this thesis. The second chapter describes the implementation of the prototype system developed as a part of this thesis work. Part II contains the papers which presents the research results of this work. 1.6 List of Papers The research results are presented in three papers in Part II of the thesis. The papers are; Paper A: Event Handling in the OpenModelica Compiler and Runtime System Håkan Lundvall, Peter Fritzson, and Bernhard Bachmann Edited version of a technical report published in Technical Reports in Computer and Information Science No. 2 Linköping Electronic Press. 2008. An earlier shorter version of this paper was published in the Proceedings of the 46th Conference on Simulation and Modelling of the Scandinavian Simulation Society (SIMS2005), Trondheim, Norway, October 13-14, 2005. 10 Background and Introduction Paper B Automatic Parallelization of Object Oriented Models Across Method and System Håkan Lundvall and Peter Fritzson Edited version of a paper originally published in Proceedings of 6th Eurosim Congress, Ljubljana, Slovenia, 2007. Paper C Automatic Parallelization of Models using Pipeline Extraction from Combined RHS and Inlined Solvers Håkan Lundvall and Peter Fritzson Edited version of a paper submitted to the 14th Workshop on Compilers for Parallel Computing, IBM Research Center, Zurich, Switzerland, January 79, 2009 Chapter 2 Implementation – The OpenModelica Parallelizing Backend In order to test and evaluate the methods and theories developed in this thesis work, which are described in Papers A to C, a prototype implementations of these methods have been developed as part of the OpenModelica compiler. This chapter describes the implementation of the prototype parallelizing backend for OpenModelica. 2.1 Background OpenModelica is an open source project aiming at creating a complete Modelica modeling, compilation and simulation environment. The current version includes the following parts: • • • • • A Modelica compiler (omc). An interactive shell (OMShell) which gives a command interface to the compiler. The OpenModelica Notebook (OMNotebook), a Mathematica style electronic notebook for Modelica where Modelica models can be written and simulated within the document. A Modelica Eclipse plugin (MDT – Modelica Development Tooling) The OpenModelica Development Environment (OMDev). The OpenModelica project was started at PELAB in 1998 and is now run within The Open Source Modelica Consortium . In earlier work on automatic parallelization of Modelica code, the ModPar system was developed and integrated into the OpenModelica compiler. 12 Implementation – The OpenModelica Parallelizing Backend Some code from the ModPar system has been reused also in this implementation to save implementation time. The ModPar system is divided into a number of different modules, some written in MetaModelica and some in C++. A description of the modules is presented in Section 2.3. 2.2 Compiler Phases In this section we provide a short overview of the different compiler phases involved when compiling a Modelica model, see also Figure 2-1. First a Modelica parser reads the Modelica source code and produces an abstract syntax tree (AST) that represents the Modelica model. In the next step a translator reorganizes the model into a flat list of equations and variables. This involves, among other things, handling inheritance, import statements and redeclarations. After the equations have been flattened the equations are topologically sorted according to the causal dependencies between the equations. This step involves symbolic manipulations such as algebraic simplifications, symbolic index reduction, and, removal of trivial equations. Once the causality has been established, the dependent variable in each equation, not part of a strong component, is solved for and the equations are converted to assignment statements. Equations that are part of a strong component must be solved simultaneously. Such systems of simultaneous equations are either solved symbolically at this stage, if possible, or left to a numerical solver. We now have an execution order in which the right hand side of the equations can be evaluated. In the last step C code is generated for all the assignment statements and a suitable C code representation (e.g. an RHS function) of the strongly connected equations is generated, which is then linked to a numerical solver. Modules of the OpenModelica Compiler 13 Modelica Source Code Modelica models Translator Flat model Analyzer Sorted equations Optimizer Optimized sorted equations Code Generator C Code C Compiler Executable Simulation Figure 2-1. OpenModelica compiler translation phases. The newly developed version of ModPar only influences the last few steps; equation sorting and code generation. An extra step has been added to the topological sorting to keep equations with dependencies closer together, and the code generation step is exchanged for a set of modules which generates a task graph and performs scheduling and generation of parallel code. In section 2.3 we give a more detailed explanation of the modules involved. 2.3 Modules of the OpenModelica Compiler The OpenModelica compiler is divided into a number of modules, see Figure 2-2. A description of the entire compiler and its modules can be found in the OpenModelica system documentation . Most of the modules are implemented in MetaModelica . The ModPar system adds a number of modules to the compiler. A list of the modules implementing the ModPar system is shown in table 2-1. For each module we have a section “Added Functionality” which summarizes the contributions to the implementation compared to the old ModPar version. 14 Implementation – The OpenModelica Parallelizing Backend Main Lookup SCode.Class (Env, name) Parse Absyn SCode /explode SCode Inst SCode.Exp Exp.Exp DAE (Exp.Exp, Types.Type) DAE Dump Flat Modelica DAELow CodeGen SimCodeGen Static C code (Env, name) Values.Value Ceval Figure 2-2. The most important modules of the OpenModelica Compiler (omc) with module connections and data flow (from the OpenModelica System Documentation). There are more than 40 modules in the compiler. When using the ModPar system the SimCodeGen module is exchanged for the ModPar system which takes over the task of generating simulation code. Some changes have also been made to the DAELow module which handles the topological sorting of the equations Module name Implementation language Specific to ModPar DAELow MetaModelica No TaskGraph MetaModelica Yes TaskGraphExt MetaModelica/C++ Yes ModPar/Schedule C++ Yes ModPar/Codegen C++ Yes ModPar/TaskGraph C++ Yes Table 2-1. List of modules implementing the ModPar system. 2.3.1 DAELow This module takes care of sorting of the equations according to data dependency and building the block lower triangular form of the system. Input to the module is Modules of the OpenModelica Compiler 15 the flattened set of equations. The module provides functions for matching of variables to solve for, to access equations and finding out the causality between the equations. 22.214.171.124 Main Functions Below follows explanations of the most important functions of the DAELow module: public function lower input DAE.DAElist lst; input Boolean add_dummy; output DAELow outDAELow; end lower; This function takes a list of discrete algebraic equations (DAE), which is the flattened model, and translates it into a different form, called DAELow defined in this module. The DAELow representation splits the DAE into equations and variables and further divides variables into known and unknown variables and the equations into simple and non-simple equations. The input variable add_dummy adds a state variable and an associated equation which is independent of all other variables. This is used if the model contains no state variables since the numeric solver requires at least one state variable. public function incidenceMatrix input DAELow inDAELow; output IncidenceMatrix outIncidenceMatrix; end incidenceMatrix; This function calculates the incidence matrix, i.e., which variables are present in each equation. This matrix has rows representing each equation and columns representing each unknown variable. A non-zero entry means that the (column) variable is referenced in the (row) equation. See also Paper C, page 77 and the following page. 16 Implementation – The OpenModelica Parallelizing Backend public function matchingAlgorithm input DAELow inDAELow; input IncidenceMatrix inIncidenceMatrix; input IncidenceMatrixT inIncidenceMatrixT; input MatchingOptions inMatchingOptions; output Integer[:] outAssignments1; output Integer[:] outAssignments2; output DAELow outDAELow3; output IncidenceMatrix outIncidenceMatrix4; output IncidenceMatrixT outIncidenceMatrixT5; end matchingAlgorithm; This function performs the matching algorithm, which is the first part of sorting the equations into BLT (Block Lower Triangular) form. The BLT form is discussed in Paper C on page 77. The matching algorithm finds a variable that is solved in each equation. But to also find out which equations form a block of equations, the second algorithm of the BLT sorting is run, finding strong components. This function returns the updated DAE which, in case of index reduction, has added equations and variables, and the incidence matrix. The variable outAssignments is returned as a vector of variable indices, as well as its inverse, i.e., which equation a variable is solved in as a vector of equation indices. The variable inMatchingOptions contain options given to the algorithm. • • Whether index reduction should be used or not. Whether the equation system is allowed to be underconstrained or not, which is used when generating code for initial equations. public function strongComponents input IncidenceMatrix inIncidenceMatrix1; input IncidenceMatrixT inIncidenceMatrixT2; input Integer[:] inAssignments1; input Integer[:] inAssignments2; output list<list<Integer>> components; end strongComponents; This is the second part of the BLT sorting. It takes the variable assignments and the incidence matrix as input and identifies strong components, i.e., subsystems of equations. The result is stored in the output result variable, a list of integer lists called components. The integers stored in the innermost list represent the equation numbers and are references to specific equations of the model. The result is a topological sorting of the dependencies between the equations, i.e., the equations can be solved in the order given by the outermost list. If the innermost list contains more than one equation, it constitutes a strongly connected component and the involved equations must be solved simultaneously. Modules of the OpenModelica Compiler 17 public function sortBlocksForShortAccessDistance input list<list<Integer>> comps; input Integer[:] matching; input IncidenceMatrix incidenceMatrix; output list<list<Integer>> outComps; end sortBlocksForShortAccessDistance; This function takes an already topologically sorted list of equations and rearranges the equations to bring equations with direct dependencies closer together. Pseudo code for this function can be found in Paper B, after page 62. 126.96.36.199 Added Functionality The main difference compared to the sequential version of the compiler is the sorting step introduced in the function sortBlocksForShortAccessDistance. This can greatly reduce the need for communication between the processors when we schedule the tasks based on the order of the equations given by this sorting step. 2.3.2 TaskGraph This module replaces the SimCodegen module when generating parallel code. In the standard OpenModelica compiler the SimCodegen module traverses the sorted equations as they come from the DAELow module and generates C code for the equations. When generating parallel code the TaskGraph module traverses the equations in the same way with the difference that instead of C code, a task graph is generated and extra tasks implementing a Runge-Kutta solver are interleaved in the generated task graph, i.e., a kind of inlining of the solver. 188.8.131.52 Main Functions The buildTaskgraph function is the main function and only public function of the TaskGraph module. Its job is to generate the task graph. public function buildTaskgraph input DAELow.DAELow inDAELow1; input Integer[:] inAssignments2; input Integer[:] inAssignments3; input list<list<Integer>> inComponets; end buildTaskgraph; It takes the sorted list of equations and the matching of equations to unknown variables as input. For each equation the variable matched to the equation by the matching algorithm is solved for so that the equation can be reorganized into an assignment statement. Then tasks are added for the right hand side of the assignment and one task for the result being the root of this generated subtree. 18 Implementation – The OpenModelica Parallelizing Backend When generating tasks for the right hand side expression, references to unknown variables are looked up in the already generated part of the task graph. Tasks for calculating all such references are already in the generated part of the task graph since the equations are traversed in data dependency order. All tasks are marked with the position in the topological sorting of the equation from which they were generated. The function does not have a return value. Instead the task graph is stored in the internal state of the C++ modules. Normally MetaModelica functions cannot have internal state, but calling C++ functions makes this possible. 184.108.40.206 Added Functionality This module now also generates tasks for the solver itself and not just for computing the right hand side of the simulated system. This means some extra variables are introduced to store intermediate results used by the solver tasks and the tasks calculating the right hand side are duplicated for each solver stage. 2.3.3 TaskGraphExt This module constitutes the interface between the MetaModelica part and the C++ part of the backend. It exposes functions for building the task graph and generating simulation code. Since this module and the rest of the ModPar system is implemented as a C++ library it can unlike MetaModelica modules, hold an internal state. The generated task graph is stored in this internal state of the C++ library and is accessible from all the C++ modules described below. 2.3.4 ModPar/Schedule This module implements the scheduling algorithm. Input to the module is the generated task graph and the number of processors. The result is one task list for each processor. The first thing the scheduler does is to split all the tasks into one set of tasks per processor to generate code for. The position of the equation in the topological sorting is used for this. In the next stage a topological sorting of the tasks within a processor must be established. This is what is discussed in Paper C. The order in which the tasks are used when generating the C code is established using a priority queue in which all tasks associated to a processor is stored. The queue is sorted on two properties: 1) task set index and 2) level. The task set index refers to which task set within a given processor the task has been assigned to and the level refers to the length of the path from the task to the ‘end task’. The ‘end task’ being a task with inbound edges from all the tasks representing the unknown variables, thus being the last task in any topological sorting. The task sets are discussed in Paper C. In short all tasks within one processor are divided into three sets executed in order and where all inbound Modules of the OpenModelica Compiler 19 communication to a processor is gathered to the boundary between the first set (Pa,i) and the second set (Pb,i) and where all outgoing communication is gathered to the boundary between the second set and the third set (Pc,i). To calculate these properties the task graph is traversed with three depth first traversals. The first one has a visitor function which sets the level attribute of each visited node to the maximum level of all child nodes incremented by one. The second one is a reverse depth first search which divides all nodes into three categories; 1. The node has at least one incoming edge from a task scheduled on a higher ranked processor or at least one parent that was already assigned to this category. This means that this node is dependent on results from the previous simulation step. 2. The node has at least one incoming edge from a task scheduled on a lower ranked processor or at least one parent was already put in this category. This means that some inbound communication from another processor must have preceded this task. 3. The node does not fit into either of the two categories above. In the third traversal the tasks are clustered into the three sets (Pa,i , Pb,i , and Pb,i). The visitor function of the third traversal works in the following way: • • • If the currently visited node is in category 3 and there is at least one child in category 1 or 2, then the node is inserted in Pa,i. If the currently visited node is in category 1 or 2 and there are at least one child assigned to different processor, then the node is inserted in Pb,i. Otherwise the currently visited node is inserted in Pc,i. 220.127.116.11 Main Functions The following are the main functions of the ModePar/Schedule module. Schedule(TaskGraph*,VertexID start, VertexID end, int nproc); Initializes the scheduler and performs the scheduling algorithm described above. TaskList * get_tasklist(int proc); Returns the sorted task list for a given processor. set<int>* getAssignedProcessors(VertexID v); Returns the set of tasks assigned to a given processor . bool isAssignedTo(VertexID v, int proc); Checks whether a given node in the task graph is assigned to a processor. 20 Implementation – The OpenModelica Parallelizing Backend 18.104.22.168 Added Functionality This module is almost completely rewritten since a totally different scheduling approach is used compared to the original ModPar system. The task merging code is replaced by a new scheduler that performs processor assignment by column index in the BLT matrix as given by the DAELow module and local scheduling of the tasks for each processor. The job of increasing the granularity of the code is moved from the global perspective into the local scheduling for each processor. 2.3.5 ModPar/Codegen This module generates parallel C code from the task graph along with the schedule provided by the modpar/Schedule module. The schedule provides one task queue for each processor to generate code for. One function is generated for each task queue and put in separate files so that each function is put in a separate object file together with the data array for the simulation variables. Before generating the simulation functions the tasks are traversed to collect data about all the communication needs. For each pair of processors between which messages need to be communicated, a set is kept where all variables communicated are stored. Before beginning to generate the tasks in Pc,i for processor a given processor i these communication sets are used to generate message receive code for all messages with processor i as destination. All messages between Pa,j and Pb,i, for two processors i and j, are merged into one single message. The same happens with the message send code after the tasks of Pb,i are generated. 22.214.171.124 Main Functions The following are the main functions in the ModPar/Codegen module. void initialize(TaskGraph*, Schedule *, ...); This function initializes the code generator. It takes the global task graph and the schedule as input parameters. Some arguments to the function have been omitted for brevity. void generateCode(char * init_code); This function uses the schedule and the task graph given in the initialization and generates C code. 126.96.36.199 Added Functionality In the version of ModPar that existed before this work started, there was only a runtime for MPI available. In that version the tasks reaching the code generator was merged into larger but fewer tasks. In a merged task all result variables of one large task was collected into one message, but there could be more than one message Simulation Runtime 21 between one pair of processors. In the new version the original small tasks are kept, instead they are clustered into three sets. This means that the message passing code has been rewritten. The simulation code for the different processors is now also generated into separate functions in order to be usable by the pthreads runtime. 2.3.6 ModPar/TaskGraph This module implements the task graph and provides primitives for accessing properties of the task graph nodes. The task graph is implemented using the Boost graph Library . 188.8.131.52 Main Functions The following are the main functions in the ModPar/TaskGraph module. EdgeID add_edge(VertexID parent, VertexID child, TaskGraph *tg, string *result=0, int prio=0); This function adds a vertex between two nodes in the task graph. The name of the result variable communicated from the parent is given in the result argument. In case of more than one incoming edge to a node there is a priority given in order to sort out which order to use the incoming data in the task. Other graph manipulating functions are available through the Boost graph library. 184.108.40.206 Added Functionality The nodes of the task graph have been equipped with a new attribute which stores information about which variable the task is involved in evaluating the result for. This information is used by the scheduler to assign the task to the appropriate processor. Otherwise this module has been kept largely the same. 2.4 Simulation Runtime When the simulation code has been generated it is linked together with a simulation runtime. The runtime is written in C++. Two different runtime systems have been developed. One for distributed memory using MPI  and one for shared memory using pthreads . 22 Implementation – The OpenModelica Parallelizing Backend 2.4.1 MPI-Based Runtime System In this section we give a short overview of the MPI-based runtime system. When running a parallel program using MPI all nodes run the same executable and the MPI system takes care of starting up the program on each node that should be used. Since the same code is run on all processors a switch statement takes care of calling the correct simulation code for each processor. Each task of the task graph generates a result, but if a task only represents a subexpression of one equation then there is no simulation variable from the model to store the result in, so a temporary variable is used. In the MPI-based runtime these temporary variables are allocated as a global array on the heap. This is due to the fact that in the previous version of ModPar it was possible that such a variable would be used in a message to another processor. In the new version, however, all tasks originating from the same equation always get assigned to the same processor. Messages to other processors are implemented using non-blocking MPI send and receive. Before sending all the variables involved in one message are copied to a send buffer, since they are not necessarily placed consecutively in memory. On the receiving end there is a corresponding receive buffer from which the values are copied to their final destination. The function containing the simulation code is called from within a loop which first updates the simulation time variable and after the simulation code is called stores the new values of all simulation variables in a result file. One result file is generated for each processor. In this way the time it takes to store the simulation results are also distributed over all the processors. The files must be concatenated afterwards to get the complete simulation result. 2.4.2 Pthreads-Based Runtime System In the pthreads runtime the entire simulation is running in the same process but it is divided into several threads. Since all threads share the same memory it is possible to just keep one array of simulation variables that all threads reference. For the communication we just have to make sure that the task that produces the value for a variable that is used in a task on another thread has completed before the receiving task starts. To accomplish we use a map indexed by the communication id consisting of the sending and receiving thread number. In this map we store how many simulation steps the sending thread has completed. The receiving thread performs a busy waiting loop until the sending thread has passed the point were the needed values are available and has increased the counter in the communication map. However, it turns out that the simulation goes a lot faster if we separate the simulation variables into one array per thread, see Paper C. Thus, in the latest version of the prototype each thread has its own copy of the simulation variable array and the communication mechanism is extended to copy the relevant variables between the copies. Additional Load Balancing and Memory Layout Ideas 23 The temporary variables used to store the results of subexpressions are generated as local variables. In this way the C compiler is able to eliminate them during optimization. The resulting executable looks the same as if the entire equation was generated as one single C expression instead of a series of assignment statements using only one operator per expression. 2.5 Additional Load Balancing and Memory Layout Ideas Some ideas have not yet found their way into the current ModPar prototype. In Paper C we mention the possibility of load balancing when the execution cost of all tasks are not known. The idea is to make an initial guess of assigning the tasks evenly over two processors and then run a small test run, adjust the initial guess based on the test run and then iterate until the tasks are balanced. Then we split each half in two and do the same thing for four processors. If this is done inside the compiler a large part of the compilation work can be reused for each test compilation so the compilation time should not suffer that much. This function is not implemented, but we made some small test manually, which indicates that a better load balancing can be achieved compared to just splitting the tasks evenly over the processors. In the current version of the prototype the simulation variables are laid out in memory in the order they appear in the model. It would be better if we made the memory layout of the variables according to the equation sorting. This would keep all variables used by one processor close in memory. We should also make sure that variables that are communicated between processes should be laid out adjacent to each other so that they need not be copied to a communication buffer before transmission to other processors. Bibliography                  Message Passing Interface Forum. MPI a message-passing interface standard. Technical Report UT-CS-94-230. 1994. B. Nicols, D. Buttlar, and J. P. Farrel. Pthreads Programming. O’Reilly & Associates, 1996. J Siek, LQ Lee, A Lumsdaine. The Boost Graph Library: User Guide and Reference Manual, Addison-Wesley, 2002. Peter Fritzson. Principles of Object-Oriented Modeling and Simulation with Modelica 2.1, 940 pp., ISBN 0-471-471631, Wiley-IEEE Press, 2004. See also www.openmodelica.org and book web page: www.mathcore.com/drModelica The Modelica Association. The Modelica Language Specification Version 3.0, September 2007. http://www.modelica.org. Peter Fritzson, et al. The OpenModelica Modeling, Simulation, and Software Development Environment. In Simulation News Europe, 44/45, December 2005 See also: http://www.ida.liu.se/projects/OpenModelica. Peter Aronsson. Automatic Parallelization of Equation-Based Simulation Programs. PhD thesis, Dissertation No. 1022, Dept, Computer and Information Science, Linköping University, Linköping, Sweden. Ernst Christen and Kenneth Bakalar. VHDL-AMS – A Hardware Descriptions Language for Analog and Mixed-Signal Applications. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 46(10):1263-1272, 1999. Paul Inigo Barton. The Modelling and Simulation of Combined Discrete/Continuous Processes. PhD thesis, Department of Chemical Engineering, Imperial Collage of Science, Technology and Medicine, London, UK, 1992. The OpenModelica System Documentation, , February 2008. www.ida.liu.se/projects/OpenModelica The Open Source Modelica Consortium (OSMC) http://www.ida.liu.se/labs/pelab/modelica/OpenSourceModelicaConsortium.html Peter Fritzson. Modelica Meta-Programming and Symbolic Transformations – MetaModelica Programming Guide. Draft version, www.openmodelica.org, June 2007. Thomas L. Casavant and Jon G. Kuhl. A taxonomy of scheduling in general-purpos distributed computing systems. Software Engineering, 14(2):141-154, 1988. C.S. Park, S.B. Choi. Multiprocessor Scheduling Algorithm Utilizing Linear Clustering of Directed Acyclic Graphs. In Parallel and Distributed Systems, Proceedings, 1997. C.Y. Lee, J.J. Hwang, Y.C. Chow, F.D Anger. Multiprocessor Scheduling with Interprocessor Communication Delays. Operating Research Letters, vol. 7(no. 3), 1988. V. Sarkar. Partitioning and Scheduling Parallel Programs for Multiprocessors. MIT Press, Cambridge, MA, 1989. T. Yang, A. Gerasoulis, DSC: Scheduling Parallel Tasks on Parallel and Distributed Systems, vol. 5(no. 9), 1994. Datum Date Avdelning, institution Division, department Institutionen för datavetenskap LINKÖPINGS UNIVERSITET Språk Language Department of Computer and Information Science Rapporttyp Report category Svenska/Swedish X Engelska/English X Licentiatavhandling ISBN ISRN Examensarbete C-uppsats D-uppsats 2008-09-26 978-91-7393-799-3 LiU-Tek-Lic-2008:39 Serietitel och serienummer Title of series, numbering ISSN 0280-7971 Övrig rapport Linköping Studies in Science and Technology URL för elektronisk version Thesis No. 1381 Titel Automatic Parallelization using Pipelining for Equation-Based Simulation Languages Författare Håkan Lundvall Sammanfattning Abstract During the most recent decades modern equation-based object-oriented modeling and simulation languages, such as Modelica, have become available. This has made it easier to build complex and more detailed models for use in simulation. To be able to simulate such large and complex systems it is sometimes not enough to rely on the ability of a compiler to optimize the simulation code and reduce the size of the underlying set of equations to speed up the simulation on a single processor. Instead we must look for ways to utilize the increasing number of processing units available in modern computers. However to gain any increased performance from a parallel computer the simulation program must be expressed in a way that exposes the potential parallelism to the computer. Doing this manually is not a simple task and most modelers are not experts in parallel computing. Therefore it is very appealing to let the compiler parallelize the simulation code automatically. This thesis investigates techniques of using automatic translation of models in typical equation based languages, such as Modelica, into parallel simulation code that enable high utilization of available processors in a parallel computer. The two main ideas investigated here are the following: first, to apply parallelization simultaneously to both the system equations and the numerical solver, and secondly. to use software pipelining to further reduce the time processors are kept waiting for the results of other processors. Prototype implementations of the investigated techniques have been developed as a part of the OpenModelica open source compiler for Modelica. The prototype has been used to evaluate the parallelization techniques by measuring the execution time of test models on a few parallel archtectures and to compare the results to sequential code as well as to the results achieved in earlier work. A measured speedup of 6.1 on eight processors on a shared memory machine has been reached. It still remains to evaluate the methods for a wider range of test models and parallel architectures. Nyckelord Automatic parallelization, Equation-Based Languages, Modelica Department of Computer and Information Science Linköpings universitet Linköping Studies in Science and Technology Faculty of Arts and Sciences - Licentiate Theses No 17 No 28 No 29 No 48 No 52 No 60 No 71 No 72 No 73 No 74 No 104 No 108 No 111 No 113 No 118 No 126 No 127 No 139 No 140 No 146 No 150 No 165 No 166 No 174 No 177 No 181 No 184 No 187 No 189 No 196 No 197 No 203 No 212 No 230 No 237 No 250 No 253 No 260 No 283 No 298 No 318 No 319 No 326 No 328 No 333 No 335 No 348 No 352 No 371 No 378 No 380 No 381 No 383 No 386 No 398 Vojin Plavsic: Interleaved Processing of Non-Numerical Data Stored on a Cyclic Memory. (Available at: FOA, Box 1165, S-581 11 Linköping, Sweden. FOA Report B30062E) Arne Jönsson, Mikael Patel: An Interactive Flowcharting Technique for Communicating and Realizing Algorithms, 1984. Johnny Eckerland: Retargeting of an Incremental Code Generator, 1984. Henrik Nordin: On the Use of Typical Cases for Knowledge-Based Consultation and Teaching, 1985. Zebo Peng: Steps Towards the Formalization of Designing VLSI Systems, 1985. Johan Fagerström: Simulation and Evaluation of Architecture based on Asynchronous Processes, 1985. Jalal Maleki: ICONStraint, A Dependency Directed Constraint Maintenance System, 1987. Tony Larsson: On the Specification and Verification of VLSI Systems, 1986. Ola Strömfors: A Structure Editor for Documents and Programs, 1986. Christos Levcopoulos: New Results about the Approximation Behavior of the Greedy Triangulation, 1986. Shamsul I. Chowdhury: Statistical Expert Systems - a Special Application Area for Knowledge-Based Computer Methodology, 1987. Rober Bilos: Incremental Scanning and Token-Based Editing, 1987. Hans Block: SPORT-SORT Sorting Algorithms and Sport Tournaments, 1987. Ralph Rönnquist: Network and Lattice Based Approaches to the Representation of Knowledge, 1987. Mariam Kamkar, Nahid Shahmehri: Affect-Chaining in Program Flow Analysis Applied to Queries of Programs, 1987. Dan Strömberg: Transfer and Distribution of Application Programs, 1987. Kristian Sandahl: Case Studies in Knowledge Acquisition, Migration and User Acceptance of Expert Systems, 1987. Christer Bäckström: Reasoning about Interdependent Actions, 1988. Mats Wirén: On Control Strategies and Incrementality in Unification-Based Chart Parsing, 1988. Johan Hultman: A Software System for Defining and Controlling Actions in a Mechanical System, 1988. Tim Hansen: Diagnosing Faults using Knowledge about Malfunctioning Behavior, 1988. Jonas Löwgren: Supporting Design and Management of Expert System User Interfaces, 1989. Ola Petersson: On Adaptive Sorting in Sequential and Parallel Models, 1989. Yngve Larsson: Dynamic Configuration in a Distributed Environment, 1989. Peter Åberg: Design of a Multiple View Presentation and Interaction Manager, 1989. Henrik Eriksson: A Study in Domain-Oriented Tool Support for Knowledge Acquisition, 1989. Ivan Rankin: The Deep Generation of Text in Expert Critiquing Systems, 1989. Simin Nadjm-Tehrani: Contributions to the Declarative Approach to Debugging Prolog Programs, 1989. Magnus Merkel: Temporal Information in Natural Language, 1989. Ulf Nilsson: A Systematic Approach to Abstract Interpretation of Logic Programs, 1989. Staffan Bonnier: Horn Clause Logic with External Procedures: Towards a Theoretical Framework, 1989. Christer Hansson: A Prototype System for Logical Reasoning about Time and Action, 1990. Björn Fjellborg: An Approach to Extraction of Pipeline Structures for VLSI High-Level Synthesis, 1990. Patrick Doherty: A Three-Valued Approach to Non-Monotonic Reasoning, 1990. Tomas Sokolnicki: Coaching Partial Plans: An Approach to Knowledge-Based Tutoring, 1990. Lars Strömberg: Postmortem Debugging of Distributed Systems, 1990. Torbjörn Näslund: SLDFA-Resolution - Computing Answers for Negative Queries, 1990. Peter D. Holmes: Using Connectivity Graphs to Support Map-Related Reasoning, 1991. Olof Johansson: Improving Implementation of Graphical User Interfaces for Object-Oriented KnowledgeBases, 1991. Rolf G Larsson: Aktivitetsbaserad kalkylering i ett nytt ekonomisystem, 1991. Lena Srömbäck: Studies in Extended Unification-Based Formalism for Linguistic Description: An Algorithm for Feature Structures with Disjunction and a Proposal for Flexible Systems, 1992. Mikael Pettersson: DML-A Language and System for the Generation of Efficient Compilers from Denotational Specification, 1992. Andreas Kågedal: Logic Programming with External Procedures: an Implementation, 1992. Patrick Lambrix: Aspects of Version Management of Composite Objects, 1992. Xinli Gu: Testability Analysis and Improvement in High-Level Synthesis Systems, 1992. Torbjörn Näslund: On the Role of Evaluations in Iterative Development of Managerial Support Sytems, 1992. Ulf Cederling: Industrial Software Development - a Case Study, 1992. Magnus Morin: Predictable Cyclic Computations in Autonomous Systems: A Computational Model and Implementation, 1992. Mehran Noghabai: Evaluation of Strategic Investments in Information Technology, 1993. Mats Larsson: A Transformational Approach to Formal Digital System Design, 1993. Johan Ringström: Compiler Generation for Parallel Languages from Denotational Specifications, 1993. Michael Jansson: Propagation of Change in an Intelligent Information System, 1993. Jonni Harrius: An Architecture and a Knowledge Representation Model for Expert Critiquing Systems, 1993. Per Österling: Symbolic Modelling of the Dynamic Environments of Autonomous Agents, 1993. Johan Boye: Dependency-based Groudness Analysis of Functional Logic Programs, 1993. No 402 No 406 No 414 No 417 No 436 No 437 No 440 FHS 3/94 FHS 4/94 No 441 No 446 No 450 No 451 No 452 No 455 FHS 5/94 No 462 No 463 No 464 No 469 No 473 No 475 No 476 No 478 FHS 7/95 No 482 No 488 No 489 No 497 No 498 No 503 FHS 8/95 FHS 9/95 No 513 No 517 No 518 No 522 No 538 No 545 No 546 FiF-a 1/96 No 549 No 550 No 557 No 558 No 561 No 563 No 567 No 575 No 576 No 587 No 589 No 591 No 595 No 597 Lars Degerstedt: Tabulated Resolution for Well Founded Semantics, 1993. Anna Moberg: Satellitkontor - en studie av kommunikationsmönster vid arbete på distans, 1993. Peter Carlsson: Separation av företagsledning och finansiering - fallstudier av företagsledarutköp ur ett agentteoretiskt perspektiv, 1994. Camilla Sjöström: Revision och lagreglering - ett historiskt perspektiv, 1994. Cecilia Sjöberg: Voices in Design: Argumentation in Participatory Development, 1994. Lars Viklund: Contributions to a High-level Programming Environment for a Scientific Computing, 1994. Peter Loborg: Error Recovery Support in Manufacturing Control Systems, 1994. Owen Eriksson: Informationssystem med verksamhetskvalitet - utvärdering baserat på ett verksamhetsinriktat och samskapande perspektiv, 1994. Karin Pettersson: Informationssystemstrukturering, ansvarsfördelning och användarinflytande - En komparativ studie med utgångspunkt i två informationssystemstrategier, 1994. Lars Poignant: Informationsteknologi och företagsetablering - Effekter på produktivitet och region, 1994. Gustav Fahl: Object Views of Relational Data in Multidatabase Systems, 1994. Henrik Nilsson: A Declarative Approach to Debugging for Lazy Functional Languages, 1994. Jonas Lind: Creditor - Firm Relations: an Interdisciplinary Analysis, 1994. Martin Sköld: Active Rules based on Object Relational Queries - Efficient Change Monitoring Techniques, 1994. Pär Carlshamre: A Collaborative Approach to Usability Engineering: Technical Communicators and System Developers in Usability-Oriented Systems Development, 1994. Stefan Cronholm: Varför CASE-verktyg i systemutveckling? - En motiv- och konsekvensstudie avseende arbetssätt och arbetsformer, 1994. Mikael Lindvall: A Study of Traceability in Object-Oriented Systems Development, 1994. Fredrik Nilsson: Strategi och ekonomisk styrning - En studie av Sandviks förvärv av Bahco Verktyg, 1994. Hans Olsén: Collage Induction: Proving Properties of Logic Programs by Program Synthesis, 1994. Lars Karlsson: Specification and Synthesis of Plans Using the Features and Fluents Framework, 1995. Ulf Söderman: On Conceptual Modelling of Mode Switching Systems, 1995. Choong-ho Yi: Reasoning about Concurrent Actions in the Trajectory Semantics, 1995. Bo Lagerström: Successiv resultatavräkning av pågående arbeten. - Fallstudier i tre byggföretag, 1995. Peter Jonsson: Complexity of State-Variable Planning under Structural Restrictions, 1995. Anders Avdic: Arbetsintegrerad systemutveckling med kalkylkprogram, 1995. Eva L Ragnemalm: Towards Student Modelling through Collaborative Dialogue with a Learning Companion, 1995. Eva Toller: Contributions to Parallel Multiparadigm Languages: Combining Object-Oriented and Rule-Based Programming, 1995. Erik Stoy: A Petri Net Based Unified Representation for Hardware/Software Co-Design, 1995. Johan Herber: Environment Support for Building Structured Mathematical Models, 1995. Stefan Svenberg: Structure-Driven Derivation of Inter-Lingual Functor-Argument Trees for Multi-Lingual Generation, 1995. Hee-Cheol Kim: Prediction and Postdiction under Uncertainty, 1995. Dan Fristedt: Metoder i användning - mot förbättring av systemutveckling genom situationell metodkunskap och metodanalys, 1995. Malin Bergvall: Systemförvaltning i praktiken - en kvalitativ studie avseende centrala begrepp, aktiviteter och ansvarsroller, 1995. Joachim Karlsson: Towards a Strategy for Software Requirements Selection, 1995. Jakob Axelsson: Schedulability-Driven Partitioning of Heterogeneous Real-Time Systems, 1995. Göran Forslund: Toward Cooperative Advice-Giving Systems: The Expert Systems Experience, 1995. Jörgen Andersson: Bilder av småföretagares ekonomistyrning, 1995. Staffan Flodin: Efficient Management of Object-Oriented Queries with Late Binding, 1996. Vadim Engelson: An Approach to Automatic Construction of Graphical User Interfaces for Applications in Scientific Computing, 1996. Magnus Werner : Multidatabase Integration using Polymorphic Queries and Views, 1996. Mikael Lind: Affärsprocessinriktad förändringsanalys - utveckling och tillämpning av synsätt och metod, 1996. Jonas Hallberg: High-Level Synthesis under Local Timing Constraints, 1996. Kristina Larsen: Förutsättningar och begränsningar för arbete på distans - erfarenheter från fyra svenska företag. 1996. Mikael Johansson: Quality Functions for Requirements Engineering Methods, 1996. Patrik Nordling: The Simulation of Rolling Bearing Dynamics on Parallel Computers, 1996. Anders Ekman: Exploration of Polygonal Environments, 1996. Niclas Andersson: Compilation of Mathematical Models to Parallel Code, 1996. Johan Jenvald: Simulation and Data Collection in Battle Training, 1996. Niclas Ohlsson: Software Quality Engineering by Early Identification of Fault-Prone Modules, 1996. Mikael Ericsson: Commenting Systems as Design Support—A Wizard-of-Oz Study, 1996. Jörgen Lindström: Chefers användning av kommunikationsteknik, 1996. Esa Falkenroth: Data Management in Control Applications - A Proposal Based on Active Database Systems, 1996. Niclas Wahllöf: A Default Extension to Description Logics and its Applications, 1996. Annika Larsson: Ekonomisk Styrning och Organisatorisk Passion - ett interaktivt perspektiv, 1997. Ling Lin: A Value-based Indexing Technique for Time Sequences, 1997. No 598 No 599 No 607 No 609 FiF-a 4 FiF-a 6 No 615 No 623 No 626 No 627 No 629 No 631 No 639 No 640 No 643 No 653 FiF-a 13 No 674 No 676 No 668 No 675 FiF-a 14 No 695 No 700 FiF-a 16 No 712 No 719 No 723 No 725 No 730 No 731 No 733 No 734 FiF-a 21 FiF-a 22 No 737 No 738 FiF-a 25 No 742 No 748 No 751 No 752 No 753 No 754 No 766 No 769 No 775 FiF-a 30 No 787 No 788 No 790 No 791 No 800 No 807 Rego Granlund: C3Fire - A Microworld Supporting Emergency Management Training, 1997. Peter Ingels: A Robust Text Processing Technique Applied to Lexical Error Recovery, 1997. Per-Arne Persson: Toward a Grounded Theory for Support of Command and Control in Military Coalitions, 1997. Jonas S Karlsson: A Scalable Data Structure for a Parallel Data Server, 1997. Carita Åbom: Videomötesteknik i olika affärssituationer - möjligheter och hinder, 1997. Tommy Wedlund: Att skapa en företagsanpassad systemutvecklingsmodell - genom rekonstruktion, värdering och vidareutveckling i T50-bolag inom ABB, 1997. Silvia Coradeschi: A Decision-Mechanism for Reactive and Coordinated Agents, 1997. Jan Ollinen: Det flexibla kontorets utveckling på Digital - Ett stöd för multiflex? 1997. David Byers: Towards Estimating Software Testability Using Static Analysis, 1997. Fredrik Eklund: Declarative Error Diagnosis of GAPLog Programs, 1997. Gunilla Ivefors: Krigsspel coh Informationsteknik inför en oförutsägbar framtid, 1997. Jens-Olof Lindh: Analysing Traffic Safety from a Case-Based Reasoning Perspective, 1997 Jukka Mäki-Turja:. Smalltalk - a suitable Real-Time Language, 1997. Juha Takkinen: CAFE: Towards a Conceptual Model for Information Management in Electronic Mail, 1997. Man Lin: Formal Analysis of Reactive Rule-based Programs, 1997. Mats Gustafsson: Bringing Role-Based Access Control to Distributed Systems, 1997. Boris Karlsson: Metodanalys för förståelse och utveckling av systemutvecklingsverksamhet. Analys och värdering av systemutvecklingsmodeller och dess användning, 1997. Marcus Bjäreland: Two Aspects of Automating Logics of Action and Change - Regression and Tractability, 1998. Jan Håkegård: Hiera rchical Test Architecture and Board-Level Test Controller Synthesis, 1998. Per-Ove Zetterlund: Normering av svensk redovisning - En studie av tillkomsten av Redovisningsrådets rekommendation om koncernredovisning (RR01:91), 1998. Jimmy Tjäder: Projektledaren & planen - en studie av projektledning i tre installations- och systemutvecklingsprojekt, 1998. Ulf Melin: Informationssystem vid ökad affärs- och processorientering - egenskaper, strategier och utveckling, 1998. Tim Heyer: COMPASS: Introduction of Formal Methods in Code Development and Inspection, 1998. Patrik Hägglund: Programming Languages for Computer Algebra, 1998. Marie-Therese Christiansson: Inter-organistorisk verksamhetsutveckling - metoder som stöd vid utveckling av partnerskap och informationssystem, 1998. Christina Wennestam: Information om immateriella resurser. Investeringar i forskning och utveckling samt i personal inom skogsindustrin, 1998. Joakim Gustafsson: Extending Temporal Action Logic for Ramification and Concurrency, 1998. Henrik André-Jönsson: Indexing time-series data using text indexing methods, 1999. Erik Larsson: High-Level Testability Analysis and Enhancement Techniques, 1998. Carl-Johan Westin: Informationsförsörjning: en fråga om ansvar - aktiviteter och uppdrag i fem stora svenska organisationers operativa informationsförsörjning, 1998. Åse Jansson: Miljöhänsyn - en del i företags styrning, 1998. Thomas Padron-McCarthy: Performance-Polymorphic Declarative Queries, 1998. Anders Bäckström: Värdeskapande kreditgivning - Kreditriskhantering ur ett agentteoretiskt perspektiv, 1998. Ulf Seigerroth: Integration av förändringsmetoder - en modell för välgrundad metodintegration, 1999. Fredrik Öberg: Object-Oriented Frameworks - A New Strategy for Case Tool Development, 1998. Jonas Mellin: Predictable Event Monitoring, 1998. Joakim Eriksson: Specifying and Managing Rules in an Active Real-Time Database System, 1998. Bengt E W Andersson: Samverkande informationssystem mellan aktörer i offentliga åtaganden - En teori om aktörsarenor i samverkan om utbyte av information, 1998. Pawel Pietrzak: Static Incorrectness Diagnosis of CLP (FD), 1999. Tobias Ritzau: Real-Time Reference Counting in RT-Java, 1999. Anders Ferntoft: Elektronisk affärskommunikation - kontaktkostnader och kontaktprocesser mellan kunder och leverantörer på producentmarknader,1999. Jo Skåmedal: Arbete på distans och arbetsformens påverkan på resor och resmönster, 1999. Johan Alvehus: Mötets metaforer. En studie av berättelser om möten, 1999. Magnus Lindahl: Bankens villkor i låneavtal vid kreditgivning till högt belånade företagsförvärv: En studie ur ett agentteoretiskt perspektiv, 2000. Martin V. Howard: Designing dynamic visualizations of temporal data, 1999. Jesper Andersson: Towards Reactive Software Architectures, 1999. Anders Henriksson: Unique kernel diagnosis, 1999. Pär J. Ågerfalk: Pragmatization of Information Systems - A Theoretical and Methodological Outline, 1999. Charlotte Björkegren: Learning for the next project - Bearers and barriers in knowledge transfer within an organisation, 1999. Håkan Nilsson: Informationsteknik som drivkraft i granskningsprocessen - En studie av fyra revisionsbyråer, 2000. Erik Berglund: Use-Oriented Documentation in Software Development, 1999. Klas Gäre: Verksamhetsförändringar i samband med IS-införande, 1999. Anders Subotic: Software Quality Inspection, 1999. Svein Bergum: Managerial communication in telework, 2000. No 809 FiF-a 32 No 808 No 820 No 823 No 832 FiF-a 34 No 842 No 844 FiF-a 37 FiF-a 40 FiF-a 41 No. 854 No 863 No 881 No 882 No 890 FiF-a 47 No 894 No 906 No 917 No 916 FiF-a-49 FiF-a-51 No 919 No 915 No 931 No 933 No 938 No 942 No 956 FiF-a 58 No 964 No 973 No 958 FiF-a 61 No 985 No 982 No 989 No 990 No 991 No 999 No 1000 No 1001 No 988 FiF-a 62 No 1003 No 1005 No 1008 No 1010 No 1015 No 1018 No 1022 FiF-a 65 Flavius Gruian: Energy-Aware Design of Digital Systems, 2000. Karin Hedström: Kunskapsanvändning och kunskapsutveckling hos verksamhetskonsulter - Erfarenheter från ett FOU-samarbete, 2000. Linda Askenäs: Affärssystemet - En studie om teknikens aktiva och passiva roll i en organisation, 2000. Jean Paul Meynard: Control of industrial robots through high-level task programming, 2000. Lars Hult: Publika Gränsytor - ett designexempel, 2000. Paul Pop: Scheduling and Communication Synthesis for Distributed Real-Time Systems, 2000. Göran Hultgren: Nätverksinriktad Förändringsanalys - perspektiv och metoder som stöd för förståelse och utveckling av affärsrelationer och informationssystem, 2000. Magnus Kald: The role of management control systems in strategic business units, 2000. Mikael Cäker: Vad kostar kunden? Modeller för intern redovisning, 2000. Ewa Braf: Organisationers kunskapsverksamheter - en kritisk studie av ”knowledge management”, 2000. Henrik Lindberg: Webbaserade affärsprocesser - Möjligheter och begränsningar, 2000. Benneth Christiansson: Att komponentbasera informationssystem - Vad säger teori och praktik?, 2000. Ola Pettersson: Deliberation in a Mobile Robot, 2000. Dan Lawesson: Towards Behavioral Model Fault Isolation for Object Oriented Control Systems, 2000. Johan Moe: Execution Tracing of Large Distributed Systems, 2001. Yuxiao Zhao: XML-based Frameworks for Internet Commerce and an Implementation of B2B e-procurement, 2001. Annika Flycht-Eriksson: Domain Knowledge Management inInformation-providing Dialogue systems, 2001. Per-Arne Segerkvist: Webbaserade imaginära organisationers samverkansformer: Informationssystemarkitektur och aktörssamverkan som förutsättningar för affärsprocesser, 2001. Stefan Svarén: Styrning av investeringar i divisionaliserade företag - Ett koncernperspektiv, 2001. Lin Han: Secure and Scalable E-Service Software Delivery, 2001. Emma Hansson: Optionsprogram för anställda - en studie av svenska börsföretag, 2001. Susanne Odar: IT som stöd för strategiska beslut, en studie av datorimplementerade modeller av verksamhet som stöd för beslut om anskaffning av JAS 1982, 2002. Stefan Holgersson: IT-system och filtrering av verksamhetskunskap - kvalitetsproblem vid analyser och beslutsfattande som bygger på uppgifter hämtade från polisens IT-system, 2001. Per Oscarsson:Informationssäkerhet i verksamheter - begrepp och modeller som stöd för förståelse av informationssäkerhet och dess hantering, 2001. Luis Alejandro Cortes: A Petri Net Based Modeling and Verification Technique for Real-Time Embedded Systems, 2001. Niklas Sandell: Redovisning i skuggan av en bankkris - Värdering av fastigheter. 2001. Fredrik Elg: Ett dynamiskt perspektiv på individuella skillnader av heuristisk kompetens, intelligens, mentala modeller, mål och konfidens i kontroll av mikrovärlden Moro, 2002. Peter Aronsson: Automatic Parallelization of Simulation Code from Equation Based Simulation Languages, 2002. Bourhane Kadmiry: Fuzzy Control of Unmanned Helicopter, 2002. Patrik Haslum: Prediction as a Knowledge Representation Problem: A Case Study in Model Design, 2002. Robert Sevenius: On the instruments of governance - A law & economics study of capital instruments in limited liability companies, 2002. Johan Petersson: Lokala elektroniska marknadsplatser - informationssystem för platsbundna affärer, 2002. Peter Bunus: Debugging and Structural Analysis of Declarative Equation-Based Languages, 2002. Gert Jervan: High-Level Test Generation and Built-In Self-Test Techniques for Digital Systems, 2002. Fredrika Berglund: Management Control and Strategy - a Case Study of Pharmaceutical Drug Development, 2002. Fredrik Karlsson: Meta-Method for Method Configuration - A Rational Unified Process Case, 2002. Sorin Manolache: Schedulability Analysis of Real-Time Systems with Stochastic Task Execution Times, 2002. Diana Szentiványi: Performance and Availability Trade-offs in Fault-Tolerant Middleware, 2002. Iakov Nakhimovski: Modeling and Simulation of Contacting Flexible Bodies in Multibody Systems, 2002. Levon Saldamli: PDEModelica - Towards a High-Level Language for Modeling with Partial Differential Equations, 2002. Almut Herzog: Secure Execution Environment for Java Electronic Services, 2002. Jon Edvardsson: Contributions to Program- and Specification-based Test Data Generation, 2002 Anders Arpteg: Adaptive Semi-structured Information Extraction, 2002. Andrzej Bednarski: A Dynamic Programming Approach to Optimal Retargetable Code Generation for Irregular Architectures, 2002. Mattias Arvola: Good to use! : Use quality of multi-user applications in the home, 2003. Lennart Ljung: Utveckling av en projektivitetsmodell - om organisationers förmåga att tillämpa projektarbetsformen, 2003. Pernilla Qvarfordt: User experience of spoken feedback in multimodal interaction, 2003. Alexander Siemers: Visualization of Dynamic Multibody Simulation With Special Reference to Contacts, 2003. Jens Gustavsson: Towards Unanticipated Runtime Software Evolution, 2003. Calin Curescu: Adaptive QoS-aware Resource Allocation for Wireless Networks, 2003. Anna Andersson: Management Information Systems in Process-oriented Healthcare Organisations, 2003. Björn Johansson: Feedforward Control in Dynamic Situations, 2003. Traian Pop: Scheduling and Optimisation of Heterogeneous Time/Event-Triggered Distributed Embedded Systems, 2003. Britt-Marie Johansson: Kundkommunikation på distans - en studie om kommunikationsmediets betydelse i affärstransaktioner, 2003. No 1024 No 1034 No 1033 FiF-a 69 No 1049 No 1052 No 1054 FiF-a 71 No 1055 No 1058 FiF-a 73 No 1079 No 1084 FiF-a 74 No 1094 No 1095 No 1099 No 1110 No 1116 FiF-a 77 No 1126 No 1127 No 1132 No 1130 No 1138 No 1149 No 1156 No 1162 No 1165 FiF-a 84 No 1166 No 1167 No 1168 FiF-a 85 No 1171 FiF-a 86 No 1172 No 1183 No 1184 No 1185 No 1190 No 1191 No 1192 No 1194 No 1204 No 1206 No 1207 No 1209 No 1225 No 1228 No 1229 No 1231 No 1233 No 1244 No 1248 No 1263 FiF-a 90 No 1272 Aleksandra Tešanovic: Towards Aspectual Component-Based Real-Time System Development, 2003. Arja Vainio-Larsson: Designing for Use in a Future Context - Five Case Studies in Retrospect, 2003. Peter Nilsson: Svenska bankers redovisningsval vid reservering för befarade kreditförluster - En studie vid införandet av nya redovisningsregler, 2003. Fredrik Ericsson: Information Technology for Learning and Acquiring of Work Knowledge, 2003. Marcus Comstedt: Towards Fine-Grained Binary Composition through Link Time Weaving, 2003. Åsa Hedenskog: Increasing the Automation of Radio Network Control, 2003. Claudiu Duma: Security and Efficiency Tradeoffs in Multicast Group Key Management, 2003. Emma Eliason: Effektanalys av IT-systems handlingsutrymme, 2003. Carl Cederberg: Experiments in Indirect Fault Injection with Open Source and Industrial Software, 2003. Daniel Karlsson: Towards Formal Verification in a Component-based Reuse Methodology, 2003. Anders Hjalmarsson: Att etablera och vidmakthålla förbättringsverksamhet - behovet av koordination och interaktion vid förändring av systemutvecklingsverksamheter, 2004. Pontus Johansson: Design and Development of Recommender Dialogue Systems, 2004. Charlotte Stoltz: Calling for Call Centres - A Study of Call Centre Locations in a Swedish Rural Region, 2004. Björn Johansson: Deciding on Using Application Service Provision in SMEs, 2004. Genevieve Gorrell: Language Modelling and Error Handling in Spoken Dialogue Systems, 2004. Ulf Johansson: Rule Extraction - the Key to Accurate and Comprehensible Data Mining Models, 2004. Sonia Sangari: Computational Models of Some Communicative Head Movements, 2004. Hans Nässla: Intra-Family Information Flow and Prospects for Communication Systems, 2004. Henrik Sällberg: On the value of customer loyalty programs - A study of point programs and switching costs, 2004. Ulf Larsson: Designarbete i dialog - karaktärisering av interaktionen mellan användare och utvecklare i en systemutvecklingsprocess, 2004. Andreas Borg: Contribution to Management and Validation of Non-Functional Requirements, 2004. Per-Ola Kristensson: Large Vocabulary Shorthand Writing on Stylus Keyboard, 2004. Pär-Anders Albinsson: Interacting with Command and Control Systems: Tools for Operators and Designers, 2004. Ioan Chisalita: Safety-Oriented Communication in Mobile Networks for Vehicles, 2004. Thomas Gustafsson: Maintaining Data Consistency im Embedded Databases for Vehicular Systems, 2004. Vaida Jakoniené: A Study in Integrating Multiple Biological Data Sources, 2005. Abdil Rashid Mohamed: High-Level Techniques for Built-In Self-Test Resources Optimization, 2005. Adrian Pop: Contributions to Meta-Modeling Tools and Methods, 2005. Fidel Vascós Palacios: On the information exchange between physicians and social insurance officers in the sick leave process: an Activity Theoretical perspective, 2005. Jenny Lagsten: Verksamhetsutvecklande utvärdering i informationssystemprojekt, 2005. Emma Larsdotter Nilsson: Modeling, Simulation, and Visualization of Metabolic Pathways Using Modelica, 2005. Christina Keller: Virtual Learning Environments in higher education. A study of students’ acceptance of educational technology, 2005. Cécile Åberg: Integration of organizational workflows and the Semantic Web, 2005. Anders Forsman: Standardisering som grund för informationssamverkan och IT-tjänster - En fallstudie baserad på trafikinformationstjänsten RDS-TMC, 2005. Yu-Hsing Huang: A systemic traffic accident model, 2005. Jan Olausson: Att modellera uppdrag - grunder för förståelse av processinriktade informationssystem i transaktionsintensiva verksamheter, 2005. Petter Ahlström: Affärsstrategier för seniorbostadsmarknaden, 2005. Mathias Cöster: Beyond IT and Productivity - How Digitization Transformed the Graphic Industry, 2005. Åsa Horzella: Beyond IT and Productivity - Effects of Digitized Information Flows in Grocery Distribution, 2005. Maria Kollberg: Beyond IT and Productivity - Effects of Digitized Information Flows in the Logging Industry, 2005. David Dinka: Role and Identity - Experience of technology in professional settings, 2005. Andreas Hansson: Increasing the Storage Capacity of Recursive Auto-associative Memory by Segmenting Data, 2005. Nicklas Bergfeldt: Towards Detached Communication for Robot Cooperation, 2005. Dennis Maciuszek: Towards Dependable Virtual Companions for Later Life, 2005. Beatrice Alenljung: Decision-making in the Requirements Engineering Process: A Human-centered Approach, 2005 Anders Larsson: System-on-Chip Test Scheduling and Test Infrastructure Design, 2005. John Wilander: Policy and Implementation Assurance for Software Security, 2005. Andreas Käll: Översättningar av en managementmodell - En studie av införandet av Balanced Scorecard i ett landsting, 2005. He Tan: Aligning and Merging Biomedical Ontologies, 2006. Artur Wilk: Descriptive Types for XML Query Language Xcerpt, 2006. Per Olof Pettersson: Sampling-based Path Planning for an Autonomous Helicopter, 2006. Kalle Burbeck: Adaptive Real-time Anomaly Detection for Safeguarding Critical Networks, 2006. Daniela Mihailescu: Implementation Methodology in Action: A Study of an Enterprise Systems Implementation Methodology, 2006. Jörgen Skågeby: Public and Non-public gifting on the Internet, 2006. Karolina Eliasson: The Use of Case-Based Reasoning in a Human-Robot Dialog System, 2006. Misook Park-Westman: Managing Competence Development Programs in a Cross-Cultural OrganisationWhat are the Barriers and Enablers, 2006. Amra Halilovic: Ett praktikperspektiv på hantering av mjukvarukomponenter, 2006. Raquel Flodström: A Framework for the Strategic Management of Information Technology, 2006. No 1277 No 1283 FiF-a 91 No 1286 No 1293 No 1302 No 1303 No 1305 No 1306 No 1307 No 1309 No 1312 No 1313 No 1317 No 1320 No 1323 No 1329 No 1331 No 1332 No 1333 No 1337 No 1339 No 1351 No 1353 No 1356 No 1359 No 1361 No 1363 No 1371 No 1373 No 1381 Viacheslav Izosimov: Scheduling and Optimization of Fault-Tolerant Embedded Systems, 2006. Håkan Hasewinkel: A Blueprint for Using Commercial Games off the Shelf in Defence Training, Education and Research Simulations, 2006. Hanna Broberg: Verksamhetsanpassade IT-stöd - Designteori och metod, 2006. Robert Kaminski: Towards an XML Document Restructuring Framework, 2006 Jiri Trnka: Prerequisites for data sharing in emergency management, 2007. Björn Hägglund: A Framework for Designing Constraint Stores, 2007. Daniel Andreasson: Slack-Time Aware Dynamic Routing Schemes for On-Chip Networks, 2007. Magnus Ingmarsson: Modelling User Tasks and Intentions for Service Discovery in Ubiquitous Computing, 2007. Gustaf Svedjemo: Ontology as Conceptual Schema when Modelling Historical Maps for Database Storage, 2007. Gianpaolo Conte: Navigation Functionalities for an Autonomous UAV Helicopter, 2007. Ola Leifler: User-Centric Critiquing in Command and Control: The DKExpert and ComPlan Approaches, 2007. Henrik Svensson: Embodied simulation as off-line representation, 2007. Zhiyuan He: System-on-Chip Test Scheduling with Defect-Probability and Temperature Considerations, 2007. Jonas Elmqvist: Components, Safety Interfaces and Compositional Analysis, 2007. Håkan Sundblad: Question Classification in Question Answering Systems, 2007. Magnus Lundqvist: Information Demand and Use: Improving Information Flow within Small-scale Business Contexts, 2007. Martin Magnusson: Deductive Planning and Composite Actions in Temporal Action Logic, 2007. Mikael Asplund: Restoring Consistency after Network Partitions, 2007. Martin Fransson: Towards Individualized Drug Dosage - General Methods and Case Studies, 2007. Karin Camara: A Visual Query Language Served by a Multi-sensor Environment, 2007. David Broman: Safety, Security, and Semantic Aspects of Equation-Based Object-Oriented Languages and Environments, 2007. Mikhail Chalabine: Invasive Interactive Parallelization, 2007. Susanna Nilsson: A Holistic Approach to Usability Evaluations of Mixed Reality Systems, 2008. Shanai Ardi: A Model and Implementation of a Security Plug-in for the Software Life Cycle, 2008. Erik Kuiper: Mobility and Routing in a Delay-tolerant Network of Unmanned Aerial Vehicles, 2008. Jana Rambusch: Situated Play, 2008. Martin Karresand: Completing the Picture - Fragments and Back Again, 2008. Per Nyblom: Dynamic Abstraction for Interleaved Task Planning and Execution, 2008. Fredrik Lantz:Terrain Object Recognition and Context Fusion for Decision Support, 2008. Martin Östlund: Assistance Plus: 3D-mediated Advice-giving on Pharmaceutical Products, 2008. Håkan Lundvall: Automatic Parallelization using Pipelining for Equation-Based Simulation Languages, 2008.