Modelling and the Modelling Cycle

Within this module, we define a model as "a purposeful abstract, simplified mathematical representation of some real system". Models are used to solve problems and answer questions about a class of systems.

Using models, we can hopefully explain observed patterns and predict responses to changes in the system. We use a model as an abstract representation of a real system because the real system is typically too complex or too costly to study directly.

Whilst modelling is used for prediction, it is not perfect, as uncertainty can play an increasingly large part the further ahead in time we predict. Modelling also allows us to formalise our understanding of a system, to see where there are gaps in that understanding, and to identify where it would be useful to direct research.

Modelling, as in this module, is also an important learning tool.

What is a Model

When building a model, we need to formulate design assumptions and rules. These assumptions are abstractions or simplifications of the real world, and the rules are the equations and algorithms that the system is governed by.

Another important property is non-uniqueness: there are many different ways to represent a system in a simplified way. We need to decide which aspects are useful to include in the model and which parameters we can ignore. Anything that is irrelevant to answering the research question is filtered out of the model.

The only way to figure out what is important for the model and what is not is to build models and see what they tell us for the parameters we give them.

Mushroom Hunting

This is a research area that we can try to build a model around. We want to establish the best strategy for searching for mushrooms. This raises several questions:

Do we search in wide sweeps and then narrow the search to small sweeps?
Do we do this because mushrooms occur in clusters?
What does large/small mean?
What is the optimal split of large/small search times?
What is the sensing radius?

There are many different questions about what the best approach to mushroom hunting is. We know that mushrooms occur in clusters and that, because of our limited sensing radius, we will have to move at some point.

Previous research has looked at albatrosses, and how they forage for food. It was found that the albatross searched in a wide sweep, interspersed with searches in more focused areas as it went.

Modelling

A request such as "Please model mushroom hunting in the forest" is quite nonspecific. A question such as "What search strategy maximizes the number of mushrooms found per unit of time?" is much clearer. It allows us to model the mushrooms simply as clusters of items, and the person searching for them as a point with some 'sensing radius'.

We can then formulate a model based on clusters of items distributed in a 2D world and a mushroom hunter that tracks the number of items found and the time since the last find. The hunter switches from large sweeps to small ones when it finds an item, then switches back if it does not find another item within some threshold time.

We can use agent-based modelling to identify the optimal threshold and the best sweep pattern to maximize the number of mushrooms found.
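The model described above can be sketched as a small agent-based simulation. This is only an illustrative sketch, not the module's reference implementation: the wrap-around world, the Gaussian clusters, the unit step length, the turning angles and every name and default value (`run_hunt`, `make_items`, `radius`, `threshold`, and so on) are assumptions chosen for the example.

```python
import math
import random


def make_items(rng, n_clusters, per_cluster, spread, size):
    """Scatter items in Gaussian clusters on a wrap-around square world."""
    items = []
    for _ in range(n_clusters):
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        for _ in range(per_cluster):
            items.append(((cx + rng.gauss(0, spread)) % size,
                          (cy + rng.gauss(0, spread)) % size))
    return items


def torus_dist2(ax, ay, bx, by, size):
    """Squared distance between two points on a wrap-around square world."""
    dx = abs(ax - bx)
    dy = abs(ay - by)
    dx = min(dx, size - dx)
    dy = min(dy, size - dy)
    return dx * dx + dy * dy


def run_hunt(threshold, steps=1000, seed=0, size=100.0, radius=1.0):
    """Simulate one hunt and return the items found per time step."""
    rng = random.Random(seed)
    # Items are generated first, so a given seed yields the same world
    # for every threshold value.
    items = make_items(rng, n_clusters=10, per_cluster=20, spread=2.0, size=size)
    x = y = size / 2
    heading = rng.uniform(0, 2 * math.pi)
    found = 0
    time_since_find = 0
    local_search = False  # False = large sweep, True = small sweep

    for _ in range(steps):
        # Small sweeps turn sharply; large sweeps keep a nearly straight course.
        max_turn = math.pi if local_search else 0.1
        heading += rng.uniform(-max_turn, max_turn)
        x = (x + math.cos(heading)) % size
        y = (y + math.sin(heading)) % size

        # Collect every item inside the sensing radius.
        remaining = [p for p in items
                     if torus_dist2(x, y, p[0], p[1], size) > radius ** 2]
        n_new = len(items) - len(remaining)
        items = remaining

        if n_new > 0:
            found += n_new
            time_since_find = 0
            local_search = True
        else:
            time_since_find += 1
            if local_search and time_since_find > threshold:
                local_search = False  # give up and return to large sweeps

    return found / steps


if __name__ == "__main__":
    # Compare a few candidate thresholds, averaged over several random worlds.
    for threshold in (0, 5, 20, 80):
        rates = [run_hunt(threshold, seed=s) for s in range(3)]
        print(threshold, sum(rates) / len(rates))
```

Averaging over several seeds matters here because a single random world can favour one threshold by chance. Which threshold performs best will depend on the assumed cluster size and sensing radius.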

The Modelling Cycle

We cannot usually know from the outset where it is appropriate to simplify the model, so modelling is usually a cyclical activity. We start with a first model built on a preliminary understanding of the system, based on theory, empirical knowledge or other similar modelling ideas.

We can then test the usefulness and appropriateness of the model's assumptions through simulations, by checking whether the model represents the data correctly.

The first model is usually not adequate for the system: it is typically too simple and lacks detail. We therefore have to go back and revise the assumptions.

If a model seems right, it may still be too complex, and we might have to change the assumptions to get the minimal model that will work for the system. A minimal model captures only the essentials of the system required to answer the question.

graph
A(Formulate Question) --> B(Assemble Hypotheses) --> C(Choose Model Structure) --> D(Implement the Model) --> E(Analyze the Model) --> F(Extract Patterns) --> G(Communicate the Model)
F --> A

Formulating the Question

The question sets the scope of the model and determines the primary filters to apply when deciding which parameters to include. This step should not be overlooked.

The initial question might not be clear enough, could be too simple, or could be too complex. For the mushroom hunt example, the question could be refined: "What search strategy maximizes the rate of finding objects if they are distributed in clusters?"

We then assemble hypotheses for the essential processes and structures we are investigating. These consider which factors have a strong influence on the phenomena of interest, and whether those factors are independent of, or affected by, other factors.

We can then brainstorm and simplify this into something less complicated. It is better to start simple, then gradually make the model more complex. E.g., for mushroom hunting, the essential process would be switching between the small and large scale search.

Variables

Things that affect the model but whose behaviour the model is not designed to study are called exogenous variables. Those that the model is designed to study are called endogenous variables.

The assumptions of the model are the combination of the definitions of the variables and their interrelations.

We need to consider the model in detail and try to produce a written formulation of the model in terms of the equations, rules and algorithms the model follows.

For a mushroom hunt, the space might be represented by a square grid, or it might be modelled as continuous space. Within the model, we have objects, such as the hunter agent and the item agents.

The hunter has some state variables which characterize it. These include the number of items found, the time since the last find, and the total time that the hunter has hunted. We also document exactly how the hunter searches.
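As a sketch of what such a written formulation might look like in code, the state variables listed above can be grouped into one record together with the rule that updates them. The class name `HunterState`, its field names, the `"large"`/`"small"` mode labels and the `threshold` parameter are all hypothetical choices for this example.

```python
from dataclasses import dataclass


@dataclass
class HunterState:
    """State variables that characterize the hunter (names are illustrative)."""
    items_found: int = 0
    time_since_last_find: int = 0
    time_hunted: int = 0
    mode: str = "large"  # "large" sweep or "small" sweep

    def update(self, n_found: int, threshold: int) -> None:
        """Advance one time step, given the number of items found in it."""
        self.time_hunted += 1
        if n_found > 0:
            # A find resets the clock and triggers a local (small) search.
            self.items_found += n_found
            self.time_since_last_find = 0
            self.mode = "small"
        else:
            self.time_since_last_find += 1
            if self.time_since_last_find > threshold:
                # Nothing found for too long: return to large sweeps.
                self.mode = "large"

    @property
    def find_rate(self) -> float:
        """Items found per unit of time hunted."""
        return self.items_found / self.time_hunted if self.time_hunted else 0.0


state = HunterState()
for n_found in [0, 0, 2, 0, 0, 0]:
    state.update(n_found, threshold=2)
print(state)
```

Keeping the switching rule in a single method makes the assumption explicit and easy to change when the model is revised.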

Ideally, in the process of formalizing the model, we can express it as a mathematical formulation. This allows us to state the model precisely. A mathematical representation is well defined, and so gives us a transparent language to work in. It also lets us build on existing theorems.

A mathematical representation is also straightforward to convert into a computer program that can compute the results for us.

Implementing the Model

Once the model has been described mathematically, we can convert it into a program that will run on a computer, which animates the model. When the program runs, the model takes on its own dynamics, or life: the program stores the state of the model and allows us to step through successive iterations.

Using this, we can then explore the consequences of our assumptions for the model. To do this, we make use of analytical techniques and simplifications, including some numerical methods.

Analysing/Testing the Model

Once the model is implemented, we want to verify it. We can compare the model's output with real-world patterns and establish under what conditions the model reproduces those patterns.

We also check that the boundary cases are reproduced correctly, and use other general insights and heuristics to see whether the model is behaving correctly. Based on our analysis, we might then simplify the model or make it more complex.
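Boundary-case checks of this kind can be written as simple assertions. The sketch below uses a deliberately tiny stand-in for the mushroom model: a hypothetical function `count_found` that counts the items lying within the sensing radius of a fixed path. The function, the example items and the path are assumptions made for illustration.

```python
import math


def count_found(items, path, radius):
    """Count distinct items within `radius` of any point on the hunter's path."""
    found = set()
    for hx, hy in path:
        for i, (ix, iy) in enumerate(items):
            if math.hypot(hx - ix, hy - iy) <= radius:
                found.add(i)
    return len(found)


items = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0)]
path = [(float(t), 0.0) for t in range(10)]  # walk along the x-axis

# Boundary cases: nothing to find, nowhere visited, no sensing, sensing everything.
assert count_found([], path, 1.0) == 0
assert count_found(items, [], 1.0) == 0
assert count_found(items, path, 0.0) == 0
assert count_found(items, path, 100.0) == len(items)

# General heuristic: a larger sensing radius should never find fewer items.
counts = [count_found(items, path, r) for r in (0.0, 1.0, 1.2, 10.0)]
assert counts == sorted(counts)
print(counts)
```

If any of these assertions fails, the fault lies in the implementation or the assumptions, not in the real system, which is why such checks are cheap to run before comparing the model with real data.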

Correctness of a Model

If the predictions the model makes are false, then at least one of the model's assumptions is wrong. However, a model is an idealisation of the world, so perfect predictions cannot be expected.

We therefore judge a model by its "robustness". A result is robust if the same result can be derived from a variety of different models of the same situation. If the result is dependent on special assumptions of the underlying model, then it is called "fragile".

Differing Focuses by Discipline

Different disciplines use models in different ways. Physicists are primarily concerned with how to describe specific situations and make numerical predictions. They favour the accuracy of the model over its realism. The social sciences are often content with a qualitative prediction, such as whether something will increase or decay over time. They sacrifice the accuracy of the model in favour of realism and generality.

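Simulation models favour accuracy and realism over generality. Levins proposed a triangle of generality, accuracy and realism, in which a model can achieve only two of the three. This is similar to the "cost, speed, quality: pick two" trade-off.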

graph
A(Generality) <-->|Physics Models| B(Accuracy);
B <-->|Applied Models| C(Realism);
C <-->|Theoretical Models| A;

Generality is the size of the class of problems that we want to explain, realism is the amount of detail in the model and how well it matches the real world, and accuracy is how small the error in the model's results is.

Why Simulate?

Simulation is a natural extension of analytical models. It is used for numerical solution methods, for example in fluid dynamics and in Monte Carlo methods.

Simulations are placeholders to formalize, explore and push theory where a strict mathematical framework for doing so would be cumbersome and not insightful.
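As a minimal illustration of the Monte Carlo methods mentioned above, the sketch below estimates pi by sampling random points in the unit square and counting how many fall inside the quarter circle. It is a standard textbook example rather than anything specific to this module, and the function name `estimate_pi` and its parameters are chosen only for illustration.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle has area pi/4 and the unit square has area 1.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n))
```

The estimate improves only slowly as the number of samples grows, with an error that shrinks roughly in proportion to one over the square root of the sample count. This is typical of Monte Carlo simulation.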