Monday, December 26, 2011

Teaching Notes: Simulation Fall 2011 and using Simpy and Sage

This was my first semester teaching a graduate-level (research-focused) simulation course. The department has not had this course in quite some time. There is a master's-level graduate course taught by an adjunct professor that focuses on simulation modeling (i.e. building the models, with an explicit de-emphasis on analysis). This course, in stark contrast, was to focus on the analytical side of simulation with a de-emphasis on model building (i.e. the models used would be considerably simpler than would be expected in the other course, including the project).

The other goal was to learn a new simulation library. I wanted to learn to use the Simpy simulation library, as it is used by researchers associated with a computational research center at my school. I was using this within Sage, a mathematical programming environment. Sage in its notebook mode was how I was going to present the material, as it allows for mixing formatted text with the results of calculations, graphs of results, etc. I wanted to test generating graphs and output of random simulations live, to demonstrate the effects of randomness throughout. I let each student choose a simulation platform. The standard here was Arena. The other options were simulation libraries targeted at various programming languages, such as Simpy (Python), SSJ (Java), Simlib (with the Law and Kelton books for C/Fortran) or Omnet++ (C++).

The students were a mix of engineering PhD/MS students and students in the MBA/MS-Industrial Engineering (IE) program. Note that there is some selection here, as everyone was fully aware that the other graduate simulation course would be offered in the spring. In fact, one of the MSIE students had previously taken that course. Most students used Arena. One used Simpy and one used Matlab (i.e. rolled his own).

Some notes
  1. Teaching MBA/MSIE students was fun. If this is what teaching MBA students is like, I'm all for it. These students were attentive, frequently asked very insightful questions, were eager to learn the material and implement it, and were quite appreciative of the analytical focus of this course. A few of them mentioned that since they were interviewing for jobs, topics covered came up in their interviews (clearly, these were quantitatively oriented MBAs). One issue came up during projects: one of the PhD students based his project on what he was exploring for his PhD thesis, so I had to explicitly state that there were different standards for projects.
  2. An analytically focused simulation course was the right idea. The MBA/MSIE students liked it and appreciated the difference, including the one who took the other simulation course previously. Focusing on the use of simulation instead of the building of simulation models put the emphasis on the use of simulations for decision making (which allowed the MBA students to bring in what they knew from other courses with them). And for the PhD students, implementing analytical methods gave them an understanding of the field. And a decent part of one PhD dissertation is going to come out of the course.
  3. Simpy - I liked using Simpy. I found it fairly easy to pick up once I started putting some time into it. One issue was the general flexibility of programming language simulation libraries compared to commercial packages. My general pattern for solving a homework problem was to (i) take code from a similar problem, (ii) (re)write a class to incorporate the differences, (iii) write data collection code, and (iv) analyze results. Those who used Arena, however, sometimes could not get modules to do what they wanted, and developing data recording procedures for an arbitrary performance measure and getting the per-replication output in Arena could be a daunting task. So what took me 20-30 minutes sometimes took the students hours.
  4. Sage - The Sage notebook view was very useful since it allowed the mix of formatted text (like a Powerpoint slide would have) along with live calculations. I used this along with simulations to demonstrate the effects of random variables and to show how various formulas and algorithms are actually implemented (the descriptions in books and articles skip implementation details). Having the description alongside the implementation made sense. When asked, the students preferred this version over the alternative of me drawing on the board (which I also did on occasion), and definitely over slides (though slides do have the benefit of giving distributable lecture notes). One other benefit was that I had to make sure I understood everything, because I would implement every procedure discussed in code, including charting, before giving the lecture, since the implementation was part of the lecture. A related downside was that the students could not efficiently replicate the analysis, as they were mostly using Excel spreadsheets for analysis and it was sometimes time-consuming to do tasks that programming made quick.
  5. Sage data analysis. Sage uses the Matplotlib library for data display and graphics. It is reasonably capable, and I have more flexibility than in, say, R. But the tighter integration with the data analysis techniques already built into R makes R a better platform when it comes down to it. Sage/Matplotlib has the advantage that modeling can be done in Sage/Python, allowing for an all-in-one tool. (R is accessible from within Sage, but it is not straightforward once you get past the R core functions.)
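
To make the homework pattern in note 3 concrete (a model, data collection code, per-replication output, then analysis), here is a minimal sketch in plain Python. It is not the Simpy code used in the course: it assumes a single-server queue with exponential interarrival and service times and uses the Lindley recursion instead of a simulation library so that it runs with the standard library alone. The function names, parameter values and the base seed are all hypothetical, chosen only for illustration, and the interval uses a normal approximation (a t quantile would be the more careful choice with few replications).

```python
import random
import statistics


def simulate_mm1_mean_wait(arrival_rate, service_rate, num_customers, rng):
    """Return the average wait in queue for one replication of an M/M/1 queue.

    Uses the Lindley recursion W[n+1] = max(0, W[n] + S[n] - A[n+1]),
    where S is a service time and A is an interarrival time.
    """
    wait = 0.0        # the first customer finds an empty system
    total_wait = 0.0
    for _ in range(num_customers):
        total_wait += wait                      # data collection
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return total_wait / num_customers


def run_replications(num_replications, base_seed=2011, **model_args):
    """Collect one output value per replication, each with its own seed."""
    return [
        simulate_mm1_mean_wait(rng=random.Random(base_seed + r), **model_args)
        for r in range(num_replications)
    ]


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean across replications."""
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / len(samples) ** 0.5
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_err, mean + z * std_err


if __name__ == "__main__":
    # Hypothetical parameters: utilization 0.8, 30 independent replications.
    waits = run_replications(
        30, arrival_rate=0.8, service_rate=1.0, num_customers=5000
    )
    low, high = confidence_interval(waits)
    print("first five replication means:", [round(w, 2) for w in waits[:5]])
    print("95%% CI for mean wait in queue: (%.2f, %.2f)" % (low, high))
```

Printing the individual replication means next to the interval is the point of the exercise: the per-replication values differ noticeably from run to run, which is the kind of output that was easy to get from code and hard to get from a packaged tool.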

Conclusions

  1. An analytically focused course works, even with non-PhD students (who were admittedly self-selected).
  2. The Sage notebook view is useful for teaching purposes: formatted text, LaTeX for equations. Sage's ability to typeset symbolic math and to put descriptions and implementation side by side was useful, especially in stochastic settings where people do not have well-developed intuition about the effects of stochasticity. I like this as a teaching environment. Unfortunately, it seems to be difficult to teach people how to install it, so I'm on my own here :-(
  3. One issue is having people in the same class using commercial simulation packages and programming languages. It is very easy to create a problem that is unexpectedly difficult in a commercial simulation package (and I have no reason to believe that it is a particular failing of Arena). The lack of flexibility in modeling, data collection and analysis makes it easy to get a student in trouble. And I am convinced that this occurs in practice once you leave the core domain the packages were designed for. (I get direct personal contact with representatives from the companies behind a couple of the packages so I get to have this discussion directly with them.)
