Wednesday, May 10, 2017

Improving Performance in Resource-Constrained Systems

I worked on a simulation study last year with Dr. Holt (University of Washington) and Dr. Srinivasan (University of Tennessee). The results of the study surprised me. It made me start thinking differently about variation and its effect on systems. It might change the way you look at bottlenecks in a resource-constrained system as well.

We modeled a multi-project system, but the results we found can be applied to any system. This multi-project system (think of the projects as engineering projects) required various resources at different stages. We modeled a variety of project structures. The projects we modeled used several common resources. The primary output we studied was the project flow time, or the time it takes a project to complete the system.

It is very difficult to correctly determine the appropriate workload that should be placed upon resources in this environment. There is no question that putting too little work into the system will tend to starve key resources. And while there is pressure to keep resources busy, overloading them usually results in unfavorable outcomes like projects taking too long.

In an ideal setting, work schedules can be developed in advance, so that resources have just the right amount of work allocated to them at various points in time. However, in the project world, demand is highly uncertain, workflow is quite unpredictable, and task durations have significant variability. Even the best-planned schedules become difficult to execute in this environment. And when many different resources are used multiple times in a single project and frequently shared between projects, any unexpected delay in a single task can cause significant ripple effects delaying one or more projects. Even a small delay in a task far away from a key resource can cause chaos in the complicated and interrelated schedules that exist in a project environment, and attempts to tightly schedule projects are soon abandoned.

Our study outlined several steps to dramatically improve the performance of these organizations and I want to talk about two of them here. 1. Determine how resources should be loaded 2. Identify the appropriate level of reserve resources

The first step is not a new concept. This is basically controlling the amount of work in the system, and there are several approaches to implement this. In manufacturing, you might refer to it as CONWIP or Constant Work in Process. In the project-management environment, the term is CONPIP or Constant Projects in Process. We applied a slightly different mechanism, but it had a similar effect as CONWIP or CONPIP. In our system, we monitored the backlog of work for the resources. This backlog would generally not be completely present in the immediate queue of work at that resource. We would only release new work into the system once the bottleneck resource was below a specified threshold.

The chart below shows the first set of results from the resource loading study. The X-Axis shows the resource workload. For example, at 100 percent, the bottleneck resource has enough work in the system to keep it busy for 36 days. At 200 percent, the bottleneck resource has enough work in the system to keep it busy for 72 days. The blue line shows the effect that an increased workload has on the average flow time. As more work is pushed into the system, the average flow time increases. Increased flow time means it takes longer for projects to complete, so customers are much less happy! Longer flow times can be detrimental to a company. The black line is plotting the throughput. With an increased workload in the system, more projects can be completed, but at a certain point, the increase is negligible.

The red line is something we called the project value index. This is defined as the number of projects completed over a given period divided by the 90-percent probable flow time. The project value index is a value we want to maximize (more projects completed while decreasing flow time). We tend to want to be just a bit to the right of the high point on the project value index. This is a good balance of throughput.



The next issue we studied was the use of additional resources. The results of this study are what really surprised me. The typical thought process for improving a system is to add resources to the bottleneck. Then to keep improving the system, you would find the next bottleneck to add resources to. This feels like the natural progression for improving projects, right? In the system studied, we did have a clear bottleneck. We had nine resources. When the bottleneck resource was at 100 percent utilization, the other resources ranged from 50 percent to 75 percent.

Another strategy we tried was to use an Expert resource. This is a resource that can be used to help any other resource, not just the bottleneck. This would be the most experienced staff member that can do everything. We didn’t want this expert resource working just at the bottleneck resource. The task durations were all random. We let the expert resource help any resource when the task was taking longer than the expected value. These expert resources would ONLY be requested for help after the task duration had exceeded the expected mean value. For example, let’s say the expected task duration was six days. If the task was not complete by day six, then the expert resource would be requested to help complete the task to “shorten the long tail” of the task duration. The expert resource is specifically used to reduce the long tail on the right side of the service time distribution.

In the chart below, we used the Project Value Index to compare the two strategies of a) adding an Expert resource, which helps reduce the long task times and b) adding a resource at the bottleneck. As you can see, using the Expert resource had a significantly better impact! Wow. I did not expect this.



 Here is what I learned from this study: When a task consumes a resource for an excessive amount of time, it not only delays this project from completing but it also delays every project in the queue for this resource. So long-tail tasks have an impact on potentially all the projects in the system, not just on the individual project. Focusing on these long-tail tasks, even on non-bottleneck processes, has a bigger impact on the system than just focusing on improving the bottleneck process.

That is something you should noodle on. This concept can of course be applied not only to project-management systems but also to many other resource-constrained systems.

Popular Posts