Wednesday, May 10, 2017

Improving Performance in Resource-Constrained Systems

I worked on a simulation study last year with Dr. Holt (University of Washington) and Dr. Srinivasan (University of Tennessee). The results of the study surprised me. They made me start thinking differently about variation and its effect on systems. They might change the way you look at bottlenecks in a resource-constrained system as well.

We modeled a multi-project system, but the results we found can be applied to any system. This multi-project system (think of the projects as engineering projects) required various resources at different stages. We modeled a variety of project structures. The projects we modeled used several common resources. The primary output we studied was the project flow time, or the time it takes a project to complete the system.

It is very difficult to correctly determine the appropriate workload that should be placed upon resources in this environment. There is no question that putting too little work into the system will tend to starve key resources. And while there is pressure to keep resources busy, overloading them usually results in unfavorable outcomes like projects taking too long.

In an ideal setting, work schedules can be developed in advance, so that resources have just the right amount of work allocated to them at various points in time. However, in the project world, demand is highly uncertain, workflow is quite unpredictable, and task durations have significant variability. Even the best-planned schedules become difficult to execute in this environment. And when many different resources are used multiple times in a single project and frequently shared between projects, any unexpected delay in a single task can cause significant ripple effects delaying one or more projects. Even a small delay in a task far away from a key resource can cause chaos in the complicated and interrelated schedules that exist in a project environment, and attempts to tightly schedule projects are soon abandoned.

Our study outlined several steps to dramatically improve the performance of these organizations, and I want to talk about two of them here:

1. Determine how resources should be loaded.
2. Identify the appropriate level of reserve resources.

The first step is not a new concept. This is basically controlling the amount of work in the system, and there are several approaches to implement this. In manufacturing, you might refer to it as CONWIP or Constant Work in Process. In the project-management environment, the term is CONPIP or Constant Projects in Process. We applied a slightly different mechanism, but it had an effect similar to CONWIP or CONPIP. In our system, we monitored the backlog of work for the resources. This backlog would generally not be completely present in the immediate queue of work at that resource. We would only release new work into the system once the backlog for the bottleneck resource was below a specified threshold.
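To make the release rule concrete, here is a minimal sketch in Python. It is not the model from the study. The function name, the inputs, and the choice to measure backlog in days of bottleneck work are my assumptions for illustration, and the 36-day threshold is just an example value.

```python
from collections import deque


def release_projects(waiting, bottleneck_backlog_days, threshold_days):
    """Release waiting projects while the bottleneck backlog is below the threshold.

    waiting: deque of bottleneck work content (in days) for each waiting
             project. This input shape is hypothetical.
    Returns the list of released work amounts and the updated backlog.
    """
    released = []
    backlog = bottleneck_backlog_days
    # Keep releasing only while the backlog is still under the threshold.
    while waiting and backlog < threshold_days:
        work = waiting.popleft()
        released.append(work)
        backlog += work
    return released, backlog


# Hypothetical example: four projects waiting, 20 days already in the system.
waiting = deque([10.0, 8.0, 12.0, 6.0])
released, backlog = release_projects(
    waiting, bottleneck_backlog_days=20.0, threshold_days=36.0
)
print(released, backlog, list(waiting))
```

With this rule, the last project released can push the backlog slightly past the threshold. Whether that overshoot is acceptable is a design choice; a stricter variant would check the backlog after adding each project instead.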

The chart below shows the first set of results from the resource loading study. The X-Axis shows the resource workload. For example, at 100 percent, the bottleneck resource has enough work in the system to keep it busy for 36 days. At 200 percent, the bottleneck resource has enough work in the system to keep it busy for 72 days. The blue line shows the effect that an increased workload has on the average flow time. As more work is pushed into the system, the average flow time increases. Increased flow time means it takes longer for projects to complete, so customers are much less happy! Longer flow times can be detrimental to a company. The black line is plotting the throughput. With an increased workload in the system, more projects can be completed, but at a certain point, the increase is negligible.

The red line is something we called the project value index. This is defined as the number of projects completed over a given period divided by the 90-percent probable flow time. The project value index is a value we want to maximize (more projects completed while decreasing flow time). We tend to want to be just a bit to the right of the high point on the project value index. This is a good balance of throughput and flow time.
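The index is easy to compute once you have a list of flow times. The sketch below assumes that the "90-percent probable flow time" means the 90th percentile of observed flow times and uses a simple nearest-rank percentile. Both are my assumptions, and the sample flow times are made up.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def project_value_index(flow_times_days):
    """Projects completed divided by the 90th-percentile flow time."""
    completed = len(flow_times_days)
    p90 = percentile_nearest_rank(flow_times_days, 90)
    return completed / p90


# Hypothetical flow times (days) for ten projects completed in one period.
flow_times = [40, 45, 50, 55, 60, 65, 70, 80, 90, 120]
p90 = percentile_nearest_rank(flow_times, 90)
pvi = project_value_index(flow_times)
print(p90, round(pvi, 4))
```

Because the denominator is a high percentile rather than the mean, a handful of very long projects drags the index down even when the average flow time looks fine.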



The next issue we studied was the use of additional resources. The results of this study are what really surprised me. The typical thought process for improving a system is to add resources to the bottleneck. Then to keep improving the system, you would find the next bottleneck to add resources to. This feels like the natural progression for improving projects, right? In the system studied, we did have a clear bottleneck. We had nine resources. When the bottleneck resource was at 100 percent utilization, the other resources ranged from 50 percent to 75 percent.

Another strategy we tried was to use an Expert resource. This is a resource that can be used to help any other resource, not just the bottleneck. This would be the most experienced staff member that can do everything. We didn’t want this expert resource working just at the bottleneck resource. The task durations were all random. We let the expert resource help any resource when the task was taking longer than the expected value. These expert resources would ONLY be requested for help after the task duration had exceeded the expected mean value. For example, let’s say the expected task duration was six days. If the task was not complete by day six, then the expert resource would be requested to help complete the task to “shorten the long tail” of the task duration. The expert resource is specifically used to reduce the long tail on the right side of the service time distribution.
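Here is a small sketch of that trigger rule. The post does not say how much the expert speeds up a late task or which distribution the task durations followed, so the `speedup` factor of 2 and the exponential durations below are purely illustrative assumptions.

```python
import random


def duration_with_expert(raw_duration, expected_duration, speedup=2.0):
    """Task duration when an expert joins only after the expected duration.

    Work done up to the expected duration is unchanged. The remaining
    overrun is completed `speedup` times faster (a hypothetical factor).
    """
    if raw_duration <= expected_duration:
        return raw_duration  # the task finished on time, no expert needed
    overrun = raw_duration - expected_duration
    return expected_duration + overrun / speedup


# Assumed exponential task durations with a mean of six days.
rng = random.Random(42)
expected = 6.0
raw = [rng.expovariate(1 / expected) for _ in range(10_000)]
helped = [duration_with_expert(d, expected) for d in raw]

print("longest task without expert:", round(max(raw), 1))
print("longest task with expert:   ", round(max(helped), 1))
```

Tasks that finish on time are untouched, so the expert's effort goes only into the right-hand tail of the distribution.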

In the chart below, we used the Project Value Index to compare the two strategies of a) adding an Expert resource, which helps reduce the long task times and b) adding a resource at the bottleneck. As you can see, using the Expert resource had a significantly better impact! Wow. I did not expect this.



Here is what I learned from this study: When a task consumes a resource for an excessive amount of time, it not only delays this project from completing but it also delays every project in the queue for this resource. So long-tail tasks potentially affect every project in the system, not just the individual project. Focusing on these long-tail tasks, even on non-bottleneck processes, has a bigger impact on the system than just focusing on improving the bottleneck process.
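A tiny worked example shows the ripple effect. It assumes a single resource working first-in, first-out, with four tasks already waiting at time zero. The durations are made up.

```python
from itertools import accumulate


def completion_times(durations):
    """Completion time of each task on one FIFO resource, starting at time 0."""
    return list(accumulate(durations))


# Four tasks of six days each, versus the same queue where the first
# task runs long (18 days instead of 6).
normal = completion_times([6, 6, 6, 6])
with_tail = completion_times([18, 6, 6, 6])

# Delay experienced by each task in the queue.
delays = [late - on_time for on_time, late in zip(normal, with_tail)]
print(normal, with_tail, delays)
```

The 12-day overrun on one task shows up as a 12-day delay on every task behind it, which is why trimming the tail helps the whole queue.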

That is something you should noodle on. This concept can of course be applied not only to project-management systems but also to many other resource-constrained systems.

Thursday, March 30, 2017

Let’s Test That Idea



The other day, I was talking to a fellow ExtendSim model developer, Aaron Kim from JWA Consulting. Aaron, who is a Lean consultant in the health care industry, uses ExtendSim as part of his Lean toolbox.  

Aaron described one of his recent simulation models, and it got me thinking about how underutilized simulation is. Why are there not more models built that simply compare concepts at a high level?

Many of you who have built models know how easy it is to A) include too much detail or B) include processes around the fringe of the problem. Doing either requires extra effort to model and can cause delays to an entire project. I already suspected these were two root causes of unsuccessful projects, but could they also be the two main reasons simulation is not used as much as it should be?  

When Aaron described his model, I thought it was a perfect example of how valuable a simple simulation model can be. Aaron built a model that compared two scheduling strategies. He stayed out of the weeds, so to speak, and simply looked at the concepts involved.     

Aaron was working with a clinic. The clinic classified their patient visits into two basic categories – Short visits and Long visits. A Short visit would take about 20 minutes, while a Long visit would take about 40 minutes. Generally, the Long visits were new patients and accounted for roughly 25 percent of all visits.

The clinic had been scheduling patients according to what they called a “template” schedule.  The template schedule method works by setting up a template of appointment times for both patient visit types. When a patient requests a specific time, the clinic gives him or her the closest appointment block designated for that type of visit.

For example, if a Long patient called in and requested an 8:10 a.m. appointment, that slot could be open for a Short visit but not for a Long one. In such cases, the clinic would then give the patient the closest appointment time slotted for a Long patient, which might be an hour or two later. Some at the clinic felt that open appointment scheduling would be better, since it would give patients appointments closer to their desired times.
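The template rule can be sketched in a few lines. This is not Aaron's ExtendSim model. The function name, the slot layout, and the use of minutes after midnight are my assumptions, chosen only to mirror the 8:10 a.m. example.

```python
def book_template(desired_min, visit_type, template, booked):
    """Book the closest free template slot of the matching visit type.

    template: list of (start_minute, visit_type) pairs.
    booked:   set of start minutes already taken (updated in place).
    Returns the chosen start minute, or None if no matching slot is free.
    """
    candidates = [
        start
        for start, kind in template
        if kind == visit_type and start not in booked
    ]
    if not candidates:
        return None
    # Closest to the desired time; ties go to the earlier slot.
    chosen = min(candidates, key=lambda start: (abs(start - desired_min), start))
    booked.add(chosen)
    return chosen


# Hypothetical morning template, in minutes after midnight (480 = 8:00 a.m.).
template = [
    (480, "Short"), (500, "Short"), (520, "Short"),
    (540, "Long"), (580, "Short"), (600, "Long"),
]
booked = set()

first = book_template(490, "Long", template, booked)   # asks for 8:10 a.m.
second = book_template(490, "Long", template, booked)  # same request again
third = book_template(490, "Long", template, booked)   # no Long slots left
print(first, second, third)
```

In this made-up template, the first Long patient who wants 8:10 a.m. lands at 9:00 a.m., and the next one is pushed to 10:00 a.m.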

An executive at the clinic suggested to Aaron that they switch to an “Open” schedule because they thought it seemed more patient-centric.

The open schedule method works by giving patients the available time closest to their desired appointment time. For example, if a patient wants a 9:00 a.m. appointment, and that slot is open, then the clinic gives it to the patient, even if it causes gaps in the schedule.
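For comparison, here is a sketch of the open rule. The 10-minute booking grid, the 8:00 a.m. to 5:00 p.m. day, and the function name are my assumptions, not details from Aaron's model.

```python
def book_open(desired_min, duration_min, booked,
              day_start=480, day_end=1020, step=10):
    """Book the free start time closest to the desired time.

    booked: list of (start, end) intervals already scheduled (updated in place).
    Start times are limited to a hypothetical 10-minute grid.
    Returns the chosen start minute, or None if nothing fits.
    """
    def is_free(start):
        end = start + duration_min
        if end > day_end:
            return False
        # The visit must not overlap any existing appointment.
        return all(end <= b_start or start >= b_end for b_start, b_end in booked)

    candidates = [s for s in range(day_start, day_end, step) if is_free(s)]
    if not candidates:
        return None
    # Closest to the desired time; ties go to the earlier start.
    chosen = min(candidates, key=lambda s: (abs(s - desired_min), s))
    booked.append((chosen, chosen + duration_min))
    return chosen


booked = []
a = book_open(540, 20, booked)  # Short visit, wants 9:00 a.m.
b = book_open(550, 40, booked)  # Long visit, wants 9:10 a.m.
c = book_open(510, 20, booked)  # Short visit, wants 8:30 a.m.
print(a, b, c, sorted(booked))
```

After these three bookings, the ten minutes between 8:50 and 9:00 a.m. are stranded: no visit type fits there. That is the kind of gap the open method can create.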

Aaron felt that the open scheduling method would leave gaps that were too small to see other patients and therefore result in the clinic scheduling fewer patients overall. Because of that, Aaron felt the template method would provide better patient satisfaction, as calculated by averaging the difference between the desired appointment times and the given appointment times.
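The satisfaction measure itself is a one-liner. The sketch below assumes the difference is taken as an absolute value, so that an appointment 20 minutes early counts the same as one 20 minutes late. The post does not say which convention Aaron used, and the sample times are made up.

```python
def average_time_gap(desired, given):
    """Average absolute gap (minutes) between desired and given times."""
    if len(desired) != len(given):
        raise ValueError("desired and given must have the same length")
    if not desired:
        raise ValueError("at least one appointment is required")
    return sum(abs(d - g) for d, g in zip(desired, given)) / len(desired)


# Hypothetical times in minutes after midnight.
desired = [490, 540, 600]
given = [540, 540, 580]
gap = average_time_gap(desired, given)
print(round(gap, 2))
```

A lower average means patients are getting appointments closer to what they asked for.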

Aaron decided to build a simple model to compare the two scheduling methods. He didn’t want an elaborate model with all the grueling details but rather something simple, just to compare the two methods, to see which one would give the better performance.  

Rather than modeling all the doctors in the clinic, Aaron chose to model just the scheduling of a single room with a single provider. He also did not model how each doctor worked different hours during the week nor how each took his or her lunch break at different times of the day nor how some preferred to come in late on Mondays or golf on Wednesday afternoons or take Friday afternoons off. Those were important details, but Aaron was not trying to model the entire clinic; rather, he simply wanted to see the difference between the two scheduling strategies.  

Aaron’s model had a specified number of patients per day wanting to book appointments for times over the following two weeks. Each patient would be booked on both an open schedule and a template schedule. The key performance measure of the system was the average time difference between the desired appointment times and the given appointment times. The results are shown below.

The number of patients per day was varied from 16 to 20. The results showed that the more patients scheduled per day, the more the template schedule outperformed the open schedule.

Because Aaron was just trying to compare two scheduling policies, this turned out to be a quick modeling project. It took less than eight hours to build the model and analyze the results.  

The time spent building simple models like this one can pay off immensely. But I hear far too many stories in which models take months to gather data for and build, and far too few in which models are built quickly just to answer simple questions like this one. The challenge for us all is to know the correct level of detail needed to answer the primary question. So the next time you have a problem that a simulation model could be used to answer, don’t be afraid to build the model, but please pay attention to the level of detail required. It will take far less time to build if you can leave out the unnecessary details, and it could make simulation a much more useful tool for you.
