================================= WEEK 9 ====================================

============================================================================
Simulation and Modelling
============================================================================

* Outline:
  -- Modelling and simulating
     . Deterministic versus stochastic simulation
     . Static versus dynamic simulation
  -- Constructing simulation models
     . Components of a simulation model
     . Example
  -- Pseudo-random number generation
     . Generating pseudo-random numbers
     . Etc


Model and simulation
--------------------

* A "model" is an abstract representation of a real system. Some models can
  be solved analytically using mathematical methods (e.g., figuring out the
  time for an object in free-fall to hit the ground), but most "real-life"
  models cannot (e.g., there is no "equation" for the stock market).

* When a model does not have an exact solution, we can simulate it to get
  information: to simulate a system, we need to model its components. We
  need to model "entities" in the system, relationships between those
  entities, and "events" (changes in the state of the system).

* Simulations come in many different flavours... The common ones are:
    discrete (entities can only have one of finitely many values) vs.
    continuous (entities can have one of infinitely many values,
    conceptually);
    static (time is not a part of the system) vs.
    dynamic (time *is* a part of the system); and
    deterministic (no event occurs at random) vs.
    probabilistic (or "stochastic") (some events occur at random).

  Here are some examples of various simulations in these different
  categories.

  - Performing "randomized integration": could be discrete or continuous,
    static, and probabilistic. (May be faster, or more accurate, than other
    methods.)

  - Simulating the trajectory of a projectile: continuous, dynamic,
    deterministic. (May be easier than solving all equations analytically.)
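
* To make the "randomized integration" example concrete, here is a minimal
  Python sketch of a static, probabilistic simulation. The function name
  mc_integrate and its parameters are hypothetical, chosen only for
  illustration: it estimates the integral of f over [a, b] by averaging f
  at uniformly random points.

```python
import random


def mc_integrate(f, a, b, n, seed=None):
    """Estimate the integral of f over [a, b] from n random samples.

    Hypothetical helper for illustration: the estimate is
    (b - a) times the average of f at n uniformly random points.
    """
    rng = random.Random(seed)      # private generator, reproducible with a seed
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)      # one random sample point in [a, b]
        total += f(x)
    return (b - a) * total / n


# Example: the exact integral of x^2 over [0, 1] is 1/3.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000, seed=1)
print(estimate)
```

  Time plays no role here (static), and repeated runs with different seeds
  give slightly different answers (probabilistic); the error shrinks as the
  number of samples n grows.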

* For the rest of the course, we're going to concentrate on dynamic,
  probabilistic, discrete simulations. For such a simulation, we must have
  some way of keeping track of the simulation time, and of generating and
  executing various events as time progresses. There are two basic ways of
  doing this.


Time-driven simulation
----------------------

* In a "time-driven" simulation, we have a variable that holds the current
  time, and we increment that time by some fixed amount at every step of
  the simulation. After each time step, we check all possible event types
  to determine if an event of that type happens at the current time, and
  handle it if it does.

* Example: simulating the trajectory of a projectile. At time zero, the
  projectile is given an initial position and velocity; as time progresses
  (incremented by a fixed amount each time), we compute the new velocity
  and position using the forces acting on the projectile. A time-driven
  simulation is ideal in this case because we know that an event (movement)
  happens at every step.

* Stopping condition? Either stop when time reaches a specific value
  (allowing us to answer questions like "how far can we get in that much
  time", for example) or when a specific state is reached in the system
  (allowing us to answer questions like "how long would it take to get as
  far as this", for example).

* What would the algorithm for a time-driven simulation generally look like
  (supposing we're given startTime, endTime, timeStep)?

    initialize the state of the system;
    let time := startTime;
    while (time < endTime)    [ or while (state != finalState) ]
        collect statistics from the current state;
        handle all events that occurred in the interval
            [time, time+timeStep);
        let time := time + timeStep;
    end while


Event-driven simulation
-----------------------

* When the events are not guaranteed to occur at regular intervals in our
  simulation, and we do not have a good bound on an appropriate time
  increment (i.e., one that is neither so small that the simulation takes
  a long time, nor so large that many events pile up in each time step),
  then it is more appropriate to use an "event-driven" simulation. (A
  typical example of this is simulating a line of customers at a bank.)

* In an event-driven simulation, we have a list of all events that occur at
  various times. The main loop simply "jumps" to the time for the first
  event and handles it, then jumps to the second one, and so on.

* Now, how does such a simulation know when to stop? Again, we can stop
  once time reaches a specific point, or once the system reaches a specific
  state.

* So, what does the algorithm for an event-driven simulation look like,
  conceptually?

    initialize the system state;
    initialize the event list;
    while (the simulation is not finished)
        collect statistics from the current state;
        remove the first event from the event list;
        set simulation time = time of this event;
        handle the event;
    end while

* You can find many more details in the course readings, explained from a
  slightly different point of view.


Discussion
----------

* There is one main issue we have not discussed regarding event-driven
  simulation algorithms: how do we keep track of the list of events? Do we
  generate *all* the events for the entire simulation at the beginning and
  then start processing? Generally, this is not what is done. Instead, we
  use a "trick": certain events are responsible for scheduling future
  events when they happen.
  This means that we have to keep track of the events in an ordered list,
  ordered by increasing time (so that when new events are scheduled during
  the simulation, they can be inserted at the right point in the list). At
  the start, we will simply generate a fixed number of starting events
  randomly, and from this point on, the rest of the events will be
  scheduled during the simulation. (You can see this is what is done in the
  starter code.)

* Recall that a simulation can stop once time reaches a specific point, or
  once the system reaches a specific state. But sometimes, we might want
  the stopping condition itself to be random. Here again, there is a
  standard "trick" that is used: we can schedule a "pseudo-event" for
  termination. It's called a pseudo-event because it's not a change in the
  state of the system being simulated: it's only something used by the
  simulation algorithm itself that, in this case, indicates the simulation
  should stop.

* In the same way, when we want to collect statistics about the state of
  the system, we can collect them after every single event, or we can
  schedule pseudo-events that indicate to the simulation that it should
  collect statistics. (This is *not* something that you have to do.)

* In fact, if the initialization of the event list consists only of
  scheduling a pseudo-event "beginning of simulation", then we don't even
  need to initialize the system state explicitly because this can be taken
  care of by the "beginning of simulation" event.


Example: single-server queueing system
--------------------------------------

* System: A single-chair barbershop. When the shop opens in the morning,
  customers start arriving at random times. If the barber is not busy when
  a customer comes in, the customer is served immediately. If the barber is
  busy, then the customer has to wait for a turn, in FIFO order. The time
  to serve each customer is also random.

* Entities in the system (and their attributes):
  - server (idle or busy status)
  - customer (arrival and service times)

* Events in the system:
  - customer arrival
  - customer departure

* Note that since we already keep track of each customer's arrival and
  departure in a separate event, each one of which knows the time of its
  occurrence, we do *not* need to keep track of customers explicitly.

* System state:
  - simulation time (initial value: 0)
  - event list (initial value: empty)
  - server status (initial value: idle)
  - customer queue (initial value: empty)

  The last two are more specifically part of the system we are simulating,
  while the first two are part of the simulation program. This is very
  important: there is always only *one* event list, because it represents
  the flow of time for the entire simulation, and all events must be
  represented there. There can be any number of waiting lists, because they
  are just part of the system being simulated, which can change depending
  on the situation we are modelling.

* Initialization:
  - simply schedule the first customer arrival event

* Customer arrival event:
  - If the server is idle, start service immediately (change server status
    to busy and schedule a customer departure, i.e., "end of service",
    event); otherwise, wait in the customer queue.
  - Schedule the next customer arrival at random so that arrival events
    have the desired distribution (we'll talk about how to do this later).

* Customer departure event:
  - Change server status to idle.
  - If the customer queue is not empty, start service on the first customer
    in the queue (change the server status back to busy and schedule a
    customer departure, i.e., "end of service", event).

* Statistics:
  - Average delay: for each customer, keep track of the waiting time (even
    if it is zero), and at the end of the simulation, compute the average
    waiting time per customer (sum of waiting times divided by number of
    customers).

  - Average server utilization: this is the fraction of the time when the
    server was busy.
    We can simply add up the lengths of time when the server was busy and
    divide by the total simulation time at the end. Note that this gives an
    average value, taken over time.

  - Average number of customers waiting: again, this is an average over
    time, of the number of customers waiting in the queue. This can be
    computed as a weighted sum: the average number of customers in the
    queue is equal to
        0 * (fraction of time when there are 0 customers) +
        1 * (fraction of time when there is 1 customer) +
        ...

  - To compute all of these statistics, we only need to keep track of the
    following additional quantities during the simulation (the total
    simulation time is already stored in the system state):
      the total number of customers for the duration of the simulation,
      the total time spent waiting for all customers,
      the total amount of time that the server was busy, and
      the weighted sum of the number of customers waiting in the queue.

  - At the end of the simulation, the server will finish working on the
    current request (if any), but ignore any requests in the waiting list.
    We will keep gathering statistics about the size of the waiting list
    but ignore any further statistics about the average delay because we
    do not know how long the remaining requests would have waited (since
    we never get to serve them).
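
* The barbershop model above can be sketched in Python roughly as follows.
  This is a minimal sketch under stated assumptions, not the course's
  starter code: the function name simulate_barbershop, its parameters, and
  the use of exponentially distributed inter-arrival and service times are
  all hypothetical choices made for illustration. It keeps the single event
  list as a heap ordered by time, uses a termination pseudo-event, and
  stops as soon as that pseudo-event is handled, which simplifies the
  end-of-simulation rule described above.

```python
import heapq
import random
from collections import deque


def simulate_barbershop(end_time, mean_interarrival, mean_service, seed=None):
    """Event-driven single-server queue (hypothetical illustration)."""
    if end_time <= 0:
        raise ValueError("end_time must be positive")
    rng = random.Random(seed)
    events = []          # the one event list: a min-heap of (time, kind)
    waiting = deque()    # arrival times of waiting customers, FIFO order
    busy = False         # server status
    now = 0.0            # simulation time
    served = 0           # customers whose service has started
    total_wait = 0.0     # total time spent waiting by those customers
    busy_time = 0.0      # total time the server was busy
    queue_area = 0.0     # sum of (queue length * duration)

    # Assumption: exponential inter-arrival and service times.
    def next_gap(mean):
        return rng.expovariate(1.0 / mean)

    heapq.heappush(events, (end_time, "end"))   # termination pseudo-event
    heapq.heappush(events, (next_gap(mean_interarrival), "arrival"))

    while events:
        time, kind = heapq.heappop(events)
        # Time-weighted statistics for the interval since the last event.
        elapsed = time - now
        queue_area += len(waiting) * elapsed
        if busy:
            busy_time += elapsed
        now = time       # jump the clock to this event, then handle it

        if kind == "end":
            break
        if kind == "arrival":
            # Each arrival schedules the next arrival.
            heapq.heappush(events, (now + next_gap(mean_interarrival), "arrival"))
            if busy:
                waiting.append(now)
            else:
                busy = True
                served += 1          # this customer waited zero time
                heapq.heappush(events, (now + next_gap(mean_service), "departure"))
        else:  # departure
            if waiting:
                total_wait += now - waiting.popleft()
                served += 1
                heapq.heappush(events, (now + next_gap(mean_service), "departure"))
            else:
                busy = False

    return {
        "average_delay": total_wait / served if served else 0.0,
        "utilization": busy_time / now,
        "average_queue_length": queue_area / now,
    }


stats = simulate_barbershop(10000.0, mean_interarrival=2.0, mean_service=1.0, seed=42)
print(stats)
```

  Note that customers are never stored as objects: the waiting list holds
  only arrival times, and everything else lives in the events themselves.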