The Project Management Soap Box

Featured book: Robust Project Design(tm) - All the things that you were never taught about modeling projects.

Saturday, December 18, 2004

[22] What's The Problem, Really? (Conclusion, part 3)

[So that others might also subscribe to Shareholder Value, please share the following link with friends and colleagues. Subscribe!]



In the previous article it became evident that multitasking had zero effect on mean resource utilization and on mean flow time. These functions, it seems, are entirely under the indirect control of the executive of an enterprise, who decides the rate at which projects are launched and by so doing indirectly sets the mean rate at which tasks arrive at all resources, even at our generic resource. So, what does multitasking do?

The answer is made clear by the next figure, which shows the mean process time of tasks as a function of the load ratio and the multitasking policy level. The white circles in that figure denote that multitasking is banned. The yellow circles denote a multitasking policy of two tasks simultaneously, presumably from two different projects. The gold circles denote a multitasking policy of four tasks simultaneously. The red circles denote a multitasking policy of ten tasks simultaneously at every opportunity.

It is evident that multitasking acts as a multiplier of the process time of tasks. This isn't so stunning a revelation. A human being is a finite capacity device. The duration of a task that receives only a fraction of the capacity of a resource is inversely proportional to the level of capacity that the task actually receives. But the figure does yield a surprise. Specifically, the magnitude of the multitasking multiplier is itself a function of the load ratio. At low values of the load ratio, multitasking has almost zero effect. As the load ratio approaches a value of 1, the magnitude of the multitasking effect approaches its maximum value. This means that the effect of multitasking is actually an interaction effect.
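The inverse relationship between capacity share and task duration can be written out in a few lines. This is only a sketch: the function names are hypothetical, and the equal split of capacity among active tasks is an assumption made for illustration.

```python
def shared_duration(planned_duration, capacity_share):
    """Duration of a task that receives only `capacity_share` (0 < share <= 1)
    of the resource's capacity. Duration is inversely proportional to share."""
    if not 0 < capacity_share <= 1:
        raise ValueError("capacity_share must be in (0, 1]")
    return planned_duration / capacity_share


def worst_case_multiplier(active_tasks):
    """Process-time multiplier if capacity is split equally (an assumption)
    among `active_tasks` tasks for the whole life of the task."""
    return 1.0 / (1.0 / active_tasks)


# A 5-day task that gets a quarter of the worker's attention takes 20 days.
print(shared_duration(5, 0.25))
# With four tasks always open, the multiplier can be as large as 4.
print(worst_case_multiplier(4))
```

The multiplier computed this way is an upper bound. At low load ratios the worker seldom has that many tasks available to open at once, which is consistent with the figure showing almost no multitasking effect there.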

This interaction, between multitasking and the load ratio, provides the most astute executives and managers with a unique opportunity to limit the damaging effects of multitasking. Per the robust design strategy defined by Dr. Genichi Taguchi decades ago, an informed management team can minimize the damaging effects of multitasking and not compromise throughput simply by maintaining the organization's actual load ratio at approximately 70%.



The next figure shows both the flow time curve (dark blue squares) and the same process time curves of the previous figure. Recall that flow time is a function exclusively of the load ratio. Therefore, so long as the load ratio is held constant, the resulting mean flow time is also a constant.

With that said, consider the case where the load ratio is held constant at 94%. The flow time data for our generic resource shows that when the load ratio is constant at 94%, the mean flow time is just shy of 500% of planned task duration. This is true regardless of the multitasking policy level.

The process time data in that same figure shows that as the multitasking policy level increases, the process time becomes an increasingly larger portion of the overall flow time. But the queue time becomes correspondingly smaller. In other words, multitasking is letting us exchange time in process for time in queue. The mean flow time is constant, no matter what the multitasking policy level happens to be. The process time and the queue time see-saw back and forth as the multitasking policy level changes. With zero multitasking, the process time is at a minimum, and the queue time is at its maximum. As the degree of multitasking increases, process time becomes longer, and queue time becomes correspondingly shorter. But the flow time is unchanged.

So, what is multitasking really doing? Multitasking is creating only the illusion of progress, for executives. Multitasking lets every resource manager look any executive straight in the eye and say with a straight face, "We're working on your project, Mr. executive. It's making progress." What's never said is the rest of that thought: "Your project is making progress at the pace of a sick, pregnant snail."

The executive, in turn, sleeps soundly at night, comforted by the knowledge that he is really getting his money's worth out of those expensive developers, scientists, and engineers. After all, the monthly report from the CFO shows that the mean utilization level of the development staff is well above 100%. He's getting free overtime. Surely his enterprise is operating at peak productivity. Isn't it?

No, the executive's enterprise is not operating at peak productivity. Quite the contrary is true. Multitasking was costing Confluence the equivalent revenue of five projects per year. When the dilution of capacity across too many active projects and multitasking defined the management paradigm of Confluence, the company's development resources were completing only six projects per year, on average. After the implementation of enterprise project management, EPM, those same resources were able to complete an average of eleven projects per year. The revenue from the five additional projects is all gravy, so to speak, and it cost virtually nothing to generate it.



The real decision faced by today's executives is illustrated by the next figure. The triangles in that figure denote the mean resource utilization. The blue squares denote the mean flow time of tasks. Both utilization and flow time are functions exclusively of the load ratio, which is controlled entirely by the executives of every enterprise. The figure clearly shows that these two measurements are in total conflict with each other. Policies and measurements designed to maximize utilization do exactly that. But they maximize utilization at the expense of flow time and speed to market. More importantly, as the Confluence case history and numerous others show unequivocally, they maximize utilization to the great detriment of real productivity and shareholder value.

The luxury of keeping people busy is costing product development organizations and IT organizations most dearly.



So what is the real problem? Finally we can answer the question posed in the titles of this and the previous two articles. The real problem is that we continue to design organizations in such a way as to optimize the wrong measurement, resource utilization. The policies, measurements, and executive practices of our enterprises are all based upon the false premise that keeping everybody busy automatically results in maximum shareholder value. In truth, keeping everybody busy results in a great destruction of shareholder value. I can say with great confidence that today at least 50% of the development payroll of nearly every product development organization is being wasted. But this is only the directly measurable waste. Much more than this is squandered worldwide, in the form of opportunity, revenue, and profits that might have been seized but never are. The great irony of it all is that our very efforts to save a few dollars are costing us millions upon millions more in lost earnings.

Is there a way out of this quagmire? Indeed there is. Confluence and others have given us indisputable proof that a useful, powerful solution is available to us. We begin to construct that solution step by step in the coming articles, as we design an Enterprise Project Management (EPM) system for product development and IT organizations. Stay tuned. These are exciting times.




Thursday, November 25, 2004

[21] What's The Problem, Really?! continued




To understand why multitasking exists, it is necessary to look at a multiproject operation through the eyes of a single knowledge worker. From the perspective of a single worker, a multiproject organization looks like an amorphous entity that generates tasks at random, pushes them to the worker, and requires the worker to complete them all as soon as possible. The worker, in addition, is required to abide by a number of policies that are imposed by the management of the organization.



Two levels of management typically define the policy environment within which a knowledge worker must exist. These are the worker's immediate supervisor, also known as the resource manager, and the executive in charge of the greater organization. The resource manager, to whom the knowledge worker reports, determines the degree of multitasking that the worker is required to perform. The executive indirectly determines the rate at which the greater organization spawns urgent, project-related tasks for the worker to perform. The executive, in other words, indirectly determines the workload for which the knowledge worker is held responsible.

To understand the interaction between the knowledge worker and his immediate environment, we use the computer model illustrated in the next figure. The organization is represented by the cloud at the left of the illustration. It is the source of urgent, project-related tasks that are pushed to the worker's in-box.

The tasks generated by the organization are of various sizes. The distribution of task sizes is skewed to the right (it has a long tail to the right). Further, the tasks are not generated in any predictable pattern. Rather, they are generated and pushed to the worker at random. The distribution of inter-arrival intervals used in this model is very nearly uniform, for reasons that are beyond the scope of this discussion.

Finally, our knowledge worker is represented by a third distribution. This one is skewed to the left (it has a long tail to the left). The random numbers generated from this distribution represent the hours of effective effort that the worker provides during each day of operation. The hours of effective effort are applied to the active tasks before the worker.

The mean of the worker output distribution is precisely eight hours, implying that the worker is capable of delivering eight hours of effective effort, but only on average. The minimum of the worker output distribution is zero, and the maximum is thirteen. A worker output of zero hours of effective effort for a particular day of operation implies that for that day the worker was absent or at least unproductive. The maximum value of thirteen implies that the worker had an unusually productive day.
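For readers who want to experiment, one distribution that matches the stated minimum, maximum, and mean is a triangular distribution. The article does not name the distribution family actually used, so the triangular shape here is purely an assumption, and the function names are hypothetical.

```python
import random

LOW, HIGH, MEAN = 0.0, 13.0, 8.0  # hours per day, as stated in the text


def triangular_mode(low, high, mean):
    """Mode that gives a triangular distribution the requested mean.
    For a triangular distribution, mean = (low + high + mode) / 3."""
    mode = 3 * mean - low - high
    if not low <= mode <= high:
        raise ValueError("no triangular distribution has this mean")
    return mode


MODE = triangular_mode(LOW, HIGH, MEAN)  # 11.0, so the long tail is to the left


def daily_effort(rng):
    """Hours of effective effort for one day (assumed triangular)."""
    return rng.triangular(LOW, HIGH, MODE)


rng = random.Random(42)
sample = [daily_effort(rng) for _ in range(20000)]
print(MODE, sum(sample) / len(sample))  # the sample mean should be close to 8
```

With the mode at eleven hours, most days land above the mean and the occasional poor day pulls the average down to eight, which is the shape the text describes.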



Next, we have the policy variables. Two policy variables are important to our model. The first of these is controlled by the executive, and we refer to it as the executive's policy variable. The executive's policy variable has two settings. The first setting is usually stated as: "If we have idle resources, then we launch more projects." In other words, the executive's policy variable really controls resource utilization throughout the organization. Further, this particularly ubiquitous setting creates a push system within organizations.

The second setting of the executive's policy variable can be stated as: "We launch new projects only as fast as we can complete projects." This is the policy that ultimately creates a pull system within organizations.

The executive's policy variable is represented within our model by the load ratio (see the next figure). This is, in essence, the ratio of two flow rates: the flow rate of tasks coming to the resource and the flow rate of tasks that leave the resource after being completed. Within a real organization, the flow rate of tasks coming to any resource is proportional to the rate at which new projects are launched. Thus, by ordering projects to be launched faster, the executive in charge of a development organization or an IT organization is increasing the rate at which tasks arrive at all resources.

Further, the denominator of the load ratio is a constant, for a very practical reason. A person is a constant throughput device, on average. Therefore, by ordering projects to be launched at a faster pace, the executive is indirectly causing the load ratio of all resources to increase. In this model we control the load ratio directly, with the understanding that an increase in the load ratio corresponds to an increase in the rate at which a real organization launches new projects.
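As a small illustration of the definition, the load ratio can be computed from the two flow rates. The function name and the task counts below are hypothetical; only the definition itself comes from the text.

```python
def load_ratio(arrival_rate, completion_rate):
    """Ratio of the rate at which tasks arrive at a resource to the rate
    at which the resource can complete them (same time unit for both)."""
    if completion_rate <= 0:
        raise ValueError("completion_rate must be positive")
    return arrival_rate / completion_rate


# Hypothetical worker who can complete 50 tasks per year, on average.
CAPACITY = 50
print(load_ratio(35, CAPACITY))  # 0.7
print(load_ratio(47, CAPACITY))  # 0.94
```

Because the denominator is fixed by the worker's average throughput, the only way the ratio moves is through the arrival rate, which is to say through the rate at which projects are launched.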



The other independent variable is the multitasking policy setting. This is controlled by the resource manager directly, who requires our generic developer to begin new tasks before his current task is completed. For the purposes of this discussion, the multitasking policy variable can be set to any integer value between 1 and 10, inclusive. A value of 1 indicates that multitasking is actually banned, since it specifies that the resource may work only one task at a time. The maximum number of tasks that any individual can work simultaneously varies from person to person, of course. And it is highly unlikely that anyone can work 10 tasks simultaneously. Thus, a multitasking policy setting of 10, which specifies that the resource must work a maximum of 10 tasks simultaneously at every possible opportunity, is used to model the case of nearly unlimited multitasking.

The model just described is used to determine the effects of the two policy variables on three measurements: mean resource utilization, mean task flow time, and mean task process time. These are defined as follows:

1) Mean resource utilization - the percentage of available work hours that are spent on project work. A 40-hour week corresponds to 100%.

2) Mean task flow time - this is defined as the average interval from task arrival to task completion. It is expressed as a percentage of planned task duration.

3) Mean task process time - this is defined as the average interval from task start to task completion. It is expressed as a percentage of planned task duration.

A fourth measurement, queue time, is implied by the flow time and process time measurements. Queue time is defined as the interval from task arrival to task start. Thus, it is the time that a task spends in the in-box of the resource, before the resource begins working it. It is evident that the sum of queue time and process time always equals flow time.
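The three task-level intervals defined above can be captured in a small record type. This is a sketch only; the class name, field names, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    arrival: float           # when the task lands in the in-box
    start: float             # when the resource first works on it
    completion: float        # when the task is finished
    planned_duration: float  # planned task duration, same time unit

    @property
    def queue_time(self):
        return self.start - self.arrival

    @property
    def process_time(self):
        return self.completion - self.start

    @property
    def flow_time(self):
        return self.completion - self.arrival

    def as_percent(self, interval):
        """Express an interval as a percentage of planned task duration."""
        return 100.0 * interval / self.planned_duration


# Hypothetical task: planned for 2 days, waits 6 days, then takes 4 days.
t = TaskRecord(arrival=0, start=6, completion=10, planned_duration=2)
print(t.as_percent(t.queue_time), t.as_percent(t.process_time),
      t.as_percent(t.flow_time))  # 300.0 200.0 500.0
```

Whatever the numbers, queue time plus process time equals flow time, so a longer process time at a fixed flow time necessarily means a shorter queue time.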

Now that we understand the model, we can begin using it.
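Readers who want to experiment with a model of this kind can start from the simplified sketch below. It is not the model used to produce the figures in these articles. As assumptions made only to keep the code short, it uses exponentially distributed task sizes and inter-arrival gaps and a constant worker capacity, and it splits capacity equally among the active tasks. All names are hypothetical, and the numbers it produces will not match the figures.

```python
import random
from collections import deque


def simulate(load_ratio, max_active, n_tasks=5000, seed=1):
    """One worker, first-in-first-out in-box, at most `max_active` open tasks.
    Returns (utilization, mean_flow, mean_process); times are multiples of
    the mean planned task duration (1.0)."""
    rng = random.Random(seed)
    clock, tasks = 0.0, []
    for _ in range(n_tasks):
        clock += rng.expovariate(load_ratio)         # mean gap = 1 / load_ratio
        tasks.append((clock, rng.expovariate(1.0)))  # (arrival, size), mean 1.0

    inbox, active = deque(), []  # active entries: [remaining, arrival, start]
    now, busy, i, eps = 0.0, 0.0, 0, 1e-9
    flows, procs = [], []
    while i < n_tasks or inbox or active:
        # Multitasking policy: open queued tasks whenever a slot is free.
        while inbox and len(active) < max_active:
            arrival, size = inbox.popleft()
            active.append([size, arrival, now])
        next_arrival = tasks[i][0] if i < n_tasks else float("inf")
        share = len(active)
        next_done = (now + min(a[0] for a in active) * share
                     if active else float("inf"))
        t_next = min(next_arrival, next_done)
        if active:
            step = t_next - now
            busy += step
            for a in active:
                a[0] -= step / share  # capacity is split equally
        now = t_next
        if next_done <= next_arrival:
            for _, arrival, start in [a for a in active if a[0] <= eps]:
                flows.append(now - arrival)
                procs.append(now - start)
            active = [a for a in active if a[0] > eps]
        else:
            inbox.append(tasks[i])
            i += 1
    return busy / now, sum(flows) / len(flows), sum(procs) / len(procs)


for policy in (1, 2, 4, 10):
    util, flow, proc = simulate(load_ratio=0.9, max_active=policy)
    print(policy, round(util, 3), round(flow, 2), round(proc, 2))
```

Running the loop at several load ratios and policy levels lets you compare utilization, flow time, and process time side by side. Queue time is simply the difference between the last two returned values.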

It is vitally important to understand the interaction between the two levels of management, which this model helps to clarify. We begin by observing that most executives regularly receive reports from the financial arm of their enterprises, a.k.a. the CFO's organization. All too often, one of these reports includes a measurement of the mean utilization of the resources of the organization. Should such a report indicate a level of utilization that is measurably less than 100%, the executive in charge inevitably concludes that much of the expensive capacity made available by his painful payroll is unutilized by the organization's projects and therefore being wasted. A loaded question comes to mind now. How long does it take most executives to respond to such a measurement?

The response time rivals the time that it takes information to travel from the executive's eyeballs to his brain. The nature of that response depends on the economic outlook and on the expected demand for the company's products, particularly the products already in the development pipeline. If the economic outlook is bright, and if the anticipated demand for the products under development is high, then the executive nearly always responds to the low utilization report with a mandate to launch more projects. After all, it is only reasonable. The report indicates that many of those expensive developers don't have enough project work to do.

Unfortunately, the decision to launch more projects quite often fails to take into account the real reasons for the low utilization measurement. Frequently developers are required to undertake a number of activities that have nothing to do with the projects of the organization and, consequently, may not show up on the CFO's utilization measurement. Such activities include training, paperwork for any number of initiatives launched by corporate, sales presentations, bug fixes for products already on the market, and more. Still, the utilization measurement is low, and the executive usually responds by pushing more projects into the organization.

The increase in the rate at which new projects are launched inevitably creates a queue of tasks in the inbox of our generic resource, our any-resource. The queue would not exist, were it not for variation. But, variation does exist. Variation exists in the arrival rate of tasks, in the size of tasks, and in the rate at which the resource completes tasks. Variation is everywhere, and it is most significant. Therefore, once the release rate of new projects is increased, it doesn't take long for a very visible queue of unstarted tasks to form in the in-box of our resource and in the in-box of every other resource as well.

Once there exists a queue of tasks for the resource, the fun begins. At this point, I'd like you to pretend that you are the supervisor of our resource. You are the resource manager. Your expensive developer is working diligently on one single task, while three or four other tasks simply sit in his in-box, untouched. Might you feel somewhat uncomfortable with this situation? Indeed you might, if you were employed by any of the vast majority of product development organizations or IT organizations of today. The four tasks in the in-box indicate that there could be as many as four different executives stalking you, each of them expecting an explanation as to why his pet project has seen zero progress for some time.

What are your options as resource manager? Well, you could simply tell the executives to take a hike. But that would be a career-limiting move for you. Given your severe need for a paycheck, that's not likely to happen. A second option is for you to tell the resource to start the tasks that have been in his in-box for perhaps a few weeks. This is a much safer move for you, since the resource can do little to curtail your career. Indeed, if he is like most other developers, he wants to do his best to help out. He wants to be a team player. Therefore, the multitasking policy has just been increased from one task at a time to five tasks simultaneously, all in the name of increased productivity.

Now, here's a wakeup call for all of us. We've increased the multitasking policy, from no multitasking to multitasking with five tasks simultaneously (presumably from five different projects). Does this mean that the queue of tasks vanishes, that the in-box of our resource is cleared out?

Indeed, the in-box of our any-resource is cleared out. The queue of tasks does vanish. But it vanishes only for the briefest period. The conditions that created the queue of tasks in the first place continue to exist. The queue was created by the increase in the arrival rate of tasks, an increase that was itself caused by the executive, when he decided to launch new projects at a faster pace. That executive decision hasn't changed. Therefore, the excessively high arrival rate of tasks persists, variation also persists, and the queue of tasks begins to build anew in just a week or two.

Ironically, the decision to use multitasking widely has no impact on resource utilization. This is illustrated by the next figure, which shows the Capacity Utilization of our any-resource, as a function of the Load Ratio and of the multitasking policy level. Recall that the Load Ratio is defined as the ratio of the mean arrival rate of tasks to the mean process rate of tasks. The multitasking policy level is denoted by the colored triangles. White triangles denote that multitasking is banned (one project at a time only). Yellow triangles denote that the resource must multitask with a maximum of two tasks simultaneously, at every opportunity. Gold triangles denote a multitasking policy level of four tasks simultaneously. Red triangles denote a multitasking policy level of ten tasks simultaneously. The figure shows that all the curves are on top of each other, indicating that the multitasking parameter has no impact on resource utilization. Resource utilization is a function exclusively of the Load Ratio, which is controlled by the executive of the organization and not by the resource managers.



Nor does the decision to use multitasking widely have any impact on flow time. Recall that flow time is defined as the interval from task arrival to task completion. The next figure shows flow time as a function of the Load Ratio and of the multitasking policy level. Flow time is shown as a percentage of the mean planned duration of tasks. The white squares denote that multitasking is banned (one project at a time only). The yellow squares denote a multitasking policy level of two tasks simultaneously, presumably from two different projects. The gold squares denote a multitasking policy level of four tasks simultaneously. Yet again, all the curves are on top of each other, indicating that the multitasking parameter has absolutely no impact on flow time. Flow time is a function exclusively of the Load Ratio, which is controlled entirely by the executive of the organization and not by the resource managers.



So, multitasking has no effect on the utilization level of our resource, and it has no effect on the mean flow time of tasks. This raises the question, "What is multitasking doing for us, or perhaps to us?" The answer to this question will be the focus of the next article.



[20] What's The Problem, Really?!





If you've read this blog up to this point, then you know that the models of projects, which are being created currently by managers who don't understand variation, fail to take into account variation. Thus, even if the models did exhibit accuracy relative to deliverables and accuracy relative to logistics (which they do not), the models cannot exhibit accuracy relative to duration. This unfortunate fact essentially dooms the respective project managers to failure. But the situation is actually worse than that, if you can believe it. It is much worse.

All the analysis that we discussed during the previous articles was based on two assumptions: (1) resources work each task at a full level of effort, and (2) each project enjoys a fully dedicated team of resources. These assumptions were valid during the early days of formal project management, when Kelly Johnson and his people were busily developing the U-2 and the SR-71 Blackbird, because at that time companies indeed had adopted the single-project model. Today, it is exceedingly difficult to find any company that continues to use the single-project model.

What changed? During the 1970's and 1980's, companies throughout the world began adopting matrix management. They all saw matrix management as a tool with which to increase the productivity of their product development operations. "Do more with less," was the cry of the day. Ironically, when they jumped onto the matrix management bandwagon they created conditions that yielded exactly the opposite of that which they sought to achieve. They all ended up doing much less with a good deal more.

This unfortunate outcome was probably impossible to predict. Even today, the mechanism by which matrix management destroys the real productivity of product development organizations and IT organizations is counterintuitive. The real problem, therefore, is hidden from view and shielded from the spotlights of scrutiny, analysis, and understanding. The solution to this devastating problem is counterintuitive to an even greater degree and therefore shielded to that greater degree from the understanding of most managers and executives. We begin by understanding the mechanism by which the productivity of organizations is destroyed: multitasking.

First, we need to define multitasking. Multitasking is a behavior exhibited by working resources, individuals. A multitasking developer contributes to several tasks simultaneously, without focus, without a full level of effort, and with much unnecessary switching from one open task to another. Multitasking is a tremendous source of variation in task duration, project duration, delays, and waste. Here is why.

Let's say that a particular developer has three open tasks and is driven to jump from task to task several times, before any of the three tasks is completed. Let's say, too, that each task is associated with a different project. Task A is part of project P1, task B is part of project P2, and task C is part of project P3. While the developer works task A, project P1 is making progress. But projects P2 and P3 are not. They are being delayed. If the developer puts down task A, rather than finishing task A and launching the work of a downstream resource in project P1, and shifts his focus to task B, then while he works task B projects P1 and P3 are being delayed. Every time that the developer leaves a task without finishing it, that task and the corresponding project are delayed.
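The delay described above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each of the three tasks needs three days of effort, that the multitasking developer switches tasks once a day, and that switching itself costs nothing. The function names are hypothetical.

```python
def sequential_finish(efforts):
    """Finish day of each task when tasks are worked one at a time, in order."""
    day, finish = 0, {}
    for name, effort in efforts.items():
        day += effort
        finish[name] = day
    return finish


def round_robin_finish(efforts, slice_days=1):
    """Finish day of each task when the developer rotates among open tasks,
    spending at most `slice_days` on one before switching (no switching cost)."""
    remaining, day, finish = dict(efforts), 0, {}
    while remaining:
        for name in list(remaining):
            work = min(slice_days, remaining[name])
            day += work
            remaining[name] -= work
            if remaining[name] <= 0:
                finish[name] = day
                del remaining[name]
    return finish


efforts = {"A": 3, "B": 3, "C": 3}   # days of effort; hypothetical values
print(sequential_finish(efforts))    # {'A': 3, 'B': 6, 'C': 9}
print(round_robin_finish(efforts))   # {'A': 7, 'B': 8, 'C': 9}
```

Under these assumptions the last task finishes on day 9 either way, but tasks A and B, and therefore the downstream work of projects P1 and P2, are released four and two days later when the developer multitasks. Any real switching cost would make the multitasking case worse still.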

Sometimes, the task switching is absolutely necessary. Shift happens, as the saying goes. Emergencies, problems, opportunities present themselves, and the people of our organizations must respond. But such responses do not constitute multitasking. They constitute the necessary, information-driven, real-time reprioritizations of the task queues of a few developers.

Such information-driven changes in the priorities of a few developers also create delays at times. But their number and their schedule impact are at most an order of magnitude smaller than the schedule impact created when most developers multitask most of the time. Further, the information-driven task switching is necessary. The task switching that constitutes multitasking is undesirable, unnecessary, and driven by factors that have nothing to do with creating shareholder value.

So, what causes the task switching that constitutes multitasking? To answer this question, and to understand the full impact of the damage to schedule performance and financial performance, we need to look closely at the interaction between a single developer and the organizational ecosystem created by the management team. We do this with a computer model of a single knowledge worker in a typical product development organization or IT organization.

[continued in the next article]


Thursday, November 18, 2004

[19] Multitasking Is Costing Billions



A note: This particular article is being republished. A number of readers helped me with the contents of a newsletter, recently. The article is the first in the newsletter, and I wanted to share the latest version of it. The message is the same. Multitasking simply causes expensive capacity to be redirected to the creation of more and more work in process, at the expense of throughput. The real question is "Why is multitasking taking place at all?" We'll explore this in subsequent articles.

Perception


Multitasking, a work paradigm that is widely perceived as a productivity enhancer, is actually costing you millions of dollars every year. After eliminating multitasking from their multiproject operations, some companies are now saving those millions, by avoiding massive payroll increases.

Multitasking is a task-level paradigm. A multitasking developer has several tasks active simultaneously. The developer shifts his/her focus from task to task many times before completing any of them. Multitasking developers typically are perceived as being diligent, hard-working, valuable, and very busy, which they most certainly are.

Most managers perceive multitasking as a productivity-enhancing paradigm, since it allows them to start projects that otherwise would have to wait for developers to become available. Consequently, many managers actively encourage their developers to multitask. If your organization performs product development projects or IT projects, then multitasking is probably devastating your performance in these areas right now.

How can you tell if multitasking is widespread in your organization? To find out, just look at the ratio of the number of active projects to the number of developers. If this ratio is greater than about 0.3, then multitasking is probably running rampant throughout your enterprise. If the ratio is approaching 0.5, well, yikes!
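The rule of thumb above is easy to apply to your own numbers. In this sketch the function name and the three labels are hypothetical, and treating 0.5 as a hard cutoff is an assumption; the text gives only the approximate thresholds.

```python
def multitasking_risk(active_projects, developers):
    """Classify an organization by its ratio of active projects to developers."""
    if developers <= 0:
        raise ValueError("developers must be positive")
    ratio = active_projects / developers
    if ratio >= 0.5:      # assumed cutoff for "approaching 0.5"
        label = "severe"
    elif ratio > 0.3:
        label = "likely"
    else:
        label = "unlikely"
    return ratio, label


# Hypothetical organization with 60 developers.
for projects in (12, 24, 30):
    print(projects, multitasking_risk(projects, 60))
```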


Reality

So, why is multitasking such a big problem? The answer to this question is explored best in two steps. First, imagine that you’re a customer at a bank. As you wait in an unpleasantly long line, for your turn with the teller, you notice that the teller is doing something unusual. Rather than completing each customer’s transaction, the teller is beginning the transaction of the second customer and even that of the third customer. Then, she is task-switching, from one transaction to another, without completing any transaction. Before long, the teller has four or five open transactions, with none of them close to being completed. The teller is multitasking.

Would you expect to complete your banking any sooner, given the teller’s multitasking paradigm? Obviously not! In fact, you can probably expect to be delayed further, by the mistakes that the teller’s frequent context switching is sure to cause.

Of course, bank tellers don’t multitask. In such an environment, multitasking is obviously damaging. So, workers simply don’t do it. Nor would bank managers tolerate it for long, even if some workers decided to try it.


Working For WIP

Any bank teller who decides to multitask is making a big mistake. Such a teller is choosing to misdirect capacity, away from generating throughput (in the form of completed transactions) and toward creating more and more work in process, WIP.

Of course, bank tellers and knowledge workers aren’t quite the same things. But the mechanism that damages the throughput of product development organizations and IT organizations is identical. When your developers multitask, they are misdirecting their own far more expensive capacity, away from generating throughput (in the form of completed projects that make money) and toward creating more and more work in process, WIP. The effect of this misdirection of capacity is that the organization’s throughput of completed projects is dramatically reduced.

By how much is throughput reduced? One software development organization, Confluence, increased its throughput from 6 completed projects per year to 11 completed projects per year. It did so simply by eliminating multitasking, without hiring a single additional developer. Another organization increased its throughput of completed projects from 5 per year to 16, again without hiring a single additional developer.
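Expressed as relative gains, the two cases above work out as follows. The function name is hypothetical; the project counts are the ones quoted in the text.

```python
def throughput_gain(before, after):
    """Relative increase in completed projects per year."""
    return (after - before) / before


cases = {"Confluence": (6, 11), "second organization": (5, 16)}
for name, (before, after) in cases.items():
    gain = throughput_gain(before, after)
    print(f"{name}: {before} -> {after} projects per year, +{gain:.0%}")
```

The first case is an increase of roughly 83 percent and the second of 220 percent, both with the same headcount.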

How did these organizations make such improvements? They changed their management process. That’s right, their management process. They replaced their antiquated management policies and practices, which were forcing their developers to multitask, with management policies and practices that enabled resource teams to achieve and sustain maximum throughput.

Today, most organizations simply continue to suffer largely unrecognized financial losses, which multitasking is inflicting upon their businesses. A few have taken action. They’ve undertaken a change process that has nearly doubled the real productivity of their product development or IT operations. Inevitably, other organizations will follow. But for the shareholders of those other organizations, improvement will be none too soon.

Friday, November 05, 2004

[18] Anatomy of a Robust Project Plan




The next figure illustrates many of the features that make up a robust project plan. Specifically, the tasks of the primary sequence of the plan are identified, as are the tasks of the component sequences. Further, the component tolerances and, more importantly, the project tolerance are included as integral parts of the plan. Finally, the commitment date is clearly indicated, after having been selected so as to provide an appropriately high level of confidence that the project can be delivered on or before the commitment date.



However, the most important of these features cannot be shown by any figure. The most important features of a robust project plan are the plan’s four forms of accuracy: accuracy with respect to deliverables, logistics, duration, and budget. These features are the result of discipline, yours and that of your superiors.

This concludes the Robust Project Design content. As I stated clearly at the beginning, good project plans are absolutely necessary for successful operations, but they are in no way sufficient. Today, the widespread use of the matrix management model ensures that the single-project model is little more than a recipe for disappointment. For truly effective operations in product development today, we need a multiproject management approach. This will be the focus of future writings.

[So that others also might subscribe to The Project Management Soap Box, please share the following link with friends and colleagues. Subscribe!]

Thursday, November 04, 2004

[17] The Critical Chain Model




In his 1997 work of fiction, Dr. E. M. Goldratt introduced the Critical Chain concept. The book’s publication triggered a seemingly unending discussion regarding the virtues of the Critical Chain versus those of the Critical Path. The discussions always focused on the one minor distinction between the definitions of Critical Chain and Critical Path, the fact that the definition of a project’s Critical Chain accounts for resource dependencies explicitly, whereas the Critical Path definition does not. This was most unfortunate. The more significant contribution of the Critical Chain model is that it is the first logistical model of projects that facilitates and even popularizes attempts to estimate and manage variation.

The Critical Chain of a project is defined as the longest sequence of dependent events that prevents the project plan from being any shorter. This differs from the popular definition of a Critical Path in that the Critical Chain definition, by stipulating dependent events, opens the door to resource dependencies as well as to precedence dependencies.

The Critical Chain definition is illustrated in the next figure. In that figure, each of the task-bar colors denotes a different skill. Further, for the sake of simplicity, the model in the figure presumes that only one person of each skill set is available.




The Critical Chain includes the tasks denoted by GR10, Blue30, Blue15, and Red15. The sequence of the tasks is determined as much by the limited availability of resources as it is determined by precedence dependencies. For this reason, the tasks in the Critical Chain typically span multiple paths of the project plan, a characteristic that at times creates a certain degree of visual confusion.

A more readily understood representation of the same project plan shows the Critical Chain tasks all at the same (middle) level. This “fishbone” representation (next figure), with each component sequence either above or below the Critical Chain, is most useful when determining where component tolerances are needed.

The Critical Chain model stipulates that a component tolerance is needed wherever a noncritical component sequence provides an input for a Critical Chain task. The component tolerances are indicated by the shorter, blue line segments with round ends.

The commitment duration of the project is represented by the distance from the wall at the left of the figure (which represents the project's start date) to the yellow circle at the right of the figure. The long, blue bar at the right end of the figure denotes the project tolerance.



Notice that in this case the upper component sequence requires a tolerance that “stretches” the Critical Chain. That is, the component tolerance is large enough to create a so-called gap in the Critical Chain of the project. This apparent gap often becomes a point of contention for many managers. But the contention is created more by an insufficient understanding of variation and by a complete lack of understanding of multiproject operations. The latter causes many managers to conclude that all newly defined projects must be started immediately. In fact, within a properly managed multiproject operation, entire projects are scheduled so that they can start well after they are planned. Thus, in such an operation there is no “wall” that would prevent the entire upper sequence of tasks from being moved earlier (to the left), relative to the Critical Chain of the project.

Further, it is worth observing that the concept of a Critical Chain is no less deterministic than the concept of a Critical Path. The very concept of criticality, in fact, is a deterministic concept. Still, the complete Critical Chain model is considerably more useful than the Critical Path model, in large part because it brings with it an explicit reminder of two assumptions that are the foundation of all project models. First, the Critical Chain model assumes that resources work each task at a full level of effort, from start to finish. This runs contrary to the practice of institutionalized multitasking, which today is the most wasteful of management practices, as anyone even remotely aware of queuing theory can verify.

Second, the Critical Chain model assumes that downstream tasks are started as soon as their inputs are available, rather than being started at scheduled times. In other words, the behavior of the resources is event-driven rather than being date-driven, not unlike the behavior of the members of any relay race team.
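The relay-race assumption can be illustrated with a small simulation. This is only a sketch under stated assumptions: the four planned durations, the uniform distribution, and the `spread` parameter are all hypothetical and are not taken from the figures. The sketch compares a sequence whose tasks start as soon as their inputs arrive with the same sequence whose tasks are never started before their scheduled dates.

```python
import random

def simulate_chain(planned, rng, spread=0.5):
    """Return (event_driven_finish, date_driven_finish) for one random trial.

    planned: planned duration of each task in a single sequence.
    spread:  hypothetical half-width of the variation, as a fraction of plan.
    """
    # Scheduled start dates come straight from the deterministic plan.
    scheduled_starts = []
    t = 0.0
    for p in planned:
        scheduled_starts.append(t)
        t += p

    event_finish = 0.0
    date_finish = 0.0
    for p, scheduled in zip(planned, scheduled_starts):
        actual = p * rng.uniform(1.0 - spread, 1.0 + spread)
        # Event-driven: start the moment the predecessor hands off.
        event_finish += actual
        # Date-driven: never start before the scheduled date, so early
        # finishes are wasted while late finishes are still passed on.
        date_finish = max(date_finish, scheduled) + actual
    return event_finish, date_finish

rng = random.Random(2004)          # fixed seed so the run is repeatable
planned = [10, 30, 15, 15]         # hypothetical plan, in business days
trials = [simulate_chain(planned, rng) for _ in range(10_000)]
mean_event = sum(e for e, _ in trials) / len(trials)
mean_date = sum(d for _, d in trials) / len(trials)
print(f"event-driven mean finish: {mean_event:.1f}")
print(f"date-driven mean finish:  {mean_date:.1f}")
```

In every trial the date-driven finish can only equal or exceed the event-driven finish, because waiting for a scheduled date discards any time gained by an early predecessor. The size of the penalty in a real project depends on the actual distributions, which this sketch does not attempt to model.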

The Critical Chain model is not the end-all of project models. A skilled statistician with a computer and an appropriately designed Monte Carlo package could do better than the Critical Chain model, but not by much and certainly not with the same minor level of effort needed to construct a Critical Chain project plan.


Saturday, October 30, 2004

[16] Tolerance Design




At this writing, three methods for calculating tolerances enjoy some measure of support: the cut and paste method, the control chart method, and the Root-Square-Error (RSE) method. We discuss each in turn.

The cut and paste method of determining component tolerances is popular among the followers of Dr. E. M. Goldratt. According to this method, a project manager gathers the inflated estimates of task duration typically provided by developers and cuts these in half. The model of the project, then, is based upon the reduced estimates of duration. The component tolerances are estimated as a percentage (usually 50%) of the deterministic estimates of duration of their respective sequences of tasks. The project tolerance is estimated as a percentage of the deterministic estimate of the duration of the longest sequence of tasks. Goldratt's followers refer to this longest sequence as the critical chain.
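The arithmetic of the cut and paste method can be sketched in a few lines. The function name, its parameters, and the three sample estimates are hypothetical, chosen only for illustration; the 50% figures follow the description above.

```python
def cut_and_paste(inflated_estimates, cut=0.5, tolerance_fraction=0.5):
    """Sketch of the cut and paste arithmetic for one sequence of tasks."""
    # Cut each developer estimate (in half by default).
    reduced = [e * cut for e in inflated_estimates]
    # The deterministic duration of the sequence is the plain sum.
    sequence_duration = sum(reduced)
    # The tolerance is a fixed percentage of that sum.
    tolerance = sequence_duration * tolerance_fraction
    return reduced, sequence_duration, tolerance

# Hypothetical developer estimates, in business days.
reduced, duration, tolerance = cut_and_paste([20, 10, 30])
print(reduced, duration, tolerance)  # [10.0, 5.0, 15.0] 30.0 15.0
```

Note that the tolerance scales in direct proportion to the sum of the task durations, which is what makes this a linear model.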

What’s good about the cut and paste method? It is overwhelmingly simple to use, as it requires only grade-school arithmetic. This simplicity seems to be of paramount importance to Goldratt and many of his followers, since the method can be taught even to the highly uneducated among us.

What’s not so good about the cut and paste method? The cut and paste method provides a linear model of variation. Unfortunately, so far as sequences of tasks are concerned, variation does not add linearly; only variance adds linearly. For a sequence of independent tasks of comparable size, variation (the standard deviation of the sequence’s duration) therefore increases with the square root of the number of tasks, not with the number of tasks itself. Thus, the linear model provided by the cut and paste method is inconsistent with sound mathematics.
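The difference between the two models is easy to see numerically. The sketch below assumes independent tasks, each with the same hypothetical 2 days of variation (one standard deviation); the function names are illustrative only.

```python
import math

def linear_tolerance(task_sigmas):
    """Linear model: treat task-level variation as if it simply added."""
    return sum(task_sigmas)

def root_sum_square(task_sigmas):
    """Variances add, so the sequence variation is the root of their sum."""
    return math.sqrt(sum(s * s for s in task_sigmas))

for n in (1, 4, 16):
    sigmas = [2.0] * n  # hypothetical: n tasks, 2 days of variation each
    print(n, linear_tolerance(sigmas), root_sum_square(sigmas))
# 1 2.0 2.0
# 4 8.0 4.0
# 16 32.0 8.0
```

With 16 tasks the linear model calls for four times the variation that the square-root relationship predicts, and the discrepancy keeps growing with the length of the sequence.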

Worst of all, the cut and paste method appears to codify the very practice that for decades has plagued the developers of virtually every product development organization: managers reduce the estimates of duration provided by the developers. Thus, it destroys trust between developers and managers, rather than building trust. It alienates the very individuals whose behavior has a direct and significant impact on the logistical performance of a product development enterprise. This alone makes the cut and paste method undesirable.

A second approach is to use a control chart. By this approach, we simply graph normalized values of project duration on a control chart. The planned (baseline) duration estimates of the projects are used as the normalizing values. For example, a project that had a planned duration of 100 business days and an actual duration of 140 business days would be represented in the control chart with a normalized duration of 1.4. The difference between the control limit and the mean of the normalized duration values serves as the basis for calculating subsequent project tolerances.
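A minimal sketch of the calculation follows. The text above does not say which kind of control chart is meant, so this example assumes an individuals (XmR) chart, whose upper limit is the mean plus 2.66 times the mean moving range. The project history is hypothetical except for its first entry, which is the 100-day and 140-day example above.

```python
def normalized_durations(projects):
    """projects: list of (planned_days, actual_days) pairs."""
    return [actual / planned for planned, actual in projects]

def individuals_chart(values):
    """Mean and upper control limit of an individuals (XmR) chart."""
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    upper_limit = mean + 2.66 * mean_mr  # 2.66 is the standard XmR constant
    return mean, upper_limit

# Hypothetical history; the first entry is the 100-day / 140-day example.
history = [(100, 140), (50, 60), (80, 128), (200, 260), (60, 90)]
values = normalized_durations(history)
mean, upper_limit = individuals_chart(values)
tolerance_fraction = upper_limit - mean
print(f"mean={mean:.2f} limit={upper_limit:.2f} tolerance={tolerance_fraction:.2f}")
# mean=1.40 limit=2.13 tolerance=0.73
```

Under this reading, the tolerance for the next project would be on the order of 73% of its planned duration, on top of a mean overrun of 40%. Even with this mild, made-up history the number is large.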

What’s good about the control chart method? It captures all the variation in project duration exhibited by an organization. Thus, the method gives us an accurate estimate of the required tolerance value.

What is unacceptable about the control chart method? Today, the resulting tolerance calculations would be impractically large and entirely unacceptable for all concerned. At this writing, finding even one product development organization that can be considered in a state of statistical control would be a very daunting task. The degree of variation in project duration exhibited by virtually every product development organization is unpredictable and astronomically large. Therefore, the control chart method, while sound and reliable, is simply impractical at this time. Perhaps it will be in use by the time that the two of you become interested in the subject of this book. I can only hope that the state of project management improves before then.

The third method for calculating tolerance values is called the Root-Square-Error (RSE) method. The RSE method is the same mathematically valid method that has been used by engineers for many decades, for the tolerance design of physical products. It is directly adaptable to the tolerance design of projects.

The RSE method is illustrated in the next figure. In support of the RSE method, each developer provides two estimates of duration per task. First the developer provides an estimate that corresponds to a high level of confidence. We call this the “safe” estimate. Then, the developer provides an estimate of the mean process time. We call this the “average” estimate. The difference, D, between safe and average estimates for each task gives us a measure of the expected variation for the task. The component tolerance is calculated as the square root of the sum of the squares of the differences, for the tasks in each component sequence. The same calculation also provides an estimate of the project tolerance, with the difference values being those that correspond to the tasks of the primary sequence in the project.



A sample calculation is provided in the next figure. For that example, the sum of the squared differences equals 758 (in units of business days squared). The corresponding project tolerance, the square root of that sum, is approximately 28 business days. Notice that the 28-day tolerance value provides a commitment duration that corresponds to a comfortably high confidence level for the entire project.
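The RSE calculation itself is short enough to sketch. The task estimates below are hypothetical (the values in the figure are not reproduced here), and the function name is illustrative only. The last line checks the one number that the text does give: a sum of squares of 758.

```python
import math

def rse_tolerance(safe, average):
    """Root-square-error tolerance for one sequence of tasks."""
    # D for each task: safe estimate minus average estimate.
    differences = [s - a for s, a in zip(safe, average)]
    sum_of_squares = sum(d * d for d in differences)
    return differences, sum_of_squares, math.sqrt(sum_of_squares)

# Hypothetical estimates, in business days, for a three-task sequence.
safe = [10, 12, 30]
average = [7, 8, 18]
differences, sum_of_squares, tolerance = rse_tolerance(safe, average)
print(differences, sum_of_squares, tolerance)  # [3, 4, 12] 169 13.0

# The article's own total: a sum of squares of 758 gives about 27.5 days.
print(round(math.sqrt(758), 1))  # 27.5
```

In the hypothetical sequence the differences add up to 19 days, while the RSE tolerance is only 13 days, and the one task with the largest difference dominates the result.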



Project tolerances calculated with the RSE method should be considered the absolute minimum values, since the RSE method takes into account only task-level variation.

Further, since the amounts of variation in project duration experienced today by virtually all product development organizations are tremendous, the tolerance values calculated with the RSE method are as inappropriately small as the values calculated by any other practical method. Given this observation, it remains for us to choose the one method that gives us the greatest benefit with the least amount of harm. For me, this is the RSE method, because it gets developers involved in the process of constructing the models of our projects. Developers contribute the two estimates of duration for each task; their contributions enhance trust, rather than destroying trust.


Thursday, October 21, 2004

[15] Diamond-Shaped Networks




Frequently enough we encounter a logistical network with a diamond structure to it. This happens when a task in the primary sequence provides inputs to two successors, one of which is in the primary sequence and the other of which is in a component sequence. This is illustrated in the next figure, which shows the primary sequence in red and the component sequence in blue.



The diamond structure doesn’t appear to be a problem, until we calculate the component tolerance and insert it into our model of the network. When we do this, we end up with a significant gap in the primary sequence of the network. This is illustrated in the next figure.

The gap is created by two factors. First, the size of the component tolerance, which is based on the variation associated with the entire component sequence, is large. Second, the precedence dependency between the last task in the component sequence and its predecessor in the primary sequence prevents the component sequence from moving to the left. Consequently, when we include the component tolerances in our model of the project, we end up with what appears to be a most discomforting gap in the primary sequence.



The knee-jerk response of most project managers today is to force the gap to vanish. Unfortunately, the resulting model ignores completely the strong interaction between variation and the parallel structure. As such, the resulting model is overwhelmingly wrong; it grossly underestimates the duration of the project; and it misleads project managers and decision-makers into making commitments that cannot be met by the resources of the enterprise. But, the resulting picture is strikingly comforting for those who lack any understanding of variation, despite the fact that accuracy relative to duration is destroyed.
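The interaction between variation and the parallel structure can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the network in the figures: the two parallel segments are given the same hypothetical 20-day mean, and their durations are drawn from a uniform distribution chosen only for simplicity.

```python
import random

def merge_trial(rng, primary_mean=20.0, component_mean=20.0, spread=0.5):
    """One trial of two parallel segments feeding the same successor."""
    primary = primary_mean * rng.uniform(1.0 - spread, 1.0 + spread)
    component = component_mean * rng.uniform(1.0 - spread, 1.0 + spread)
    # The successor cannot start until BOTH inputs have arrived.
    return primary, component, max(primary, component)

rng = random.Random(2004)  # fixed seed so the run is repeatable
trials = [merge_trial(rng) for _ in range(10_000)]
mean_primary = sum(t[0] for t in trials) / len(trials)
mean_component = sum(t[1] for t in trials) / len(trials)
mean_merge = sum(t[2] for t in trials) / len(trials)
print(f"primary segment alone:   {mean_primary:.1f}")
print(f"component segment alone: {mean_component:.1f}")
print(f"merge point (both done): {mean_merge:.1f}")
```

Each segment averages about 20 days on its own, yet the merge point averages noticeably later, because the successor always waits for whichever segment happens to be slower. A model that forces the gap to vanish plans the merge at 20 days and so underestimates the duration.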



There is a solution, of course. Rather than eliminating the gap arbitrarily, we can move the earlier part of the component sequence to the left. Specifically, we move early the portion of the component sequence that precedes the problem dependency. By doing so, we uncouple most of the component sequence from the diamond-shaped feature of the network, and we diminish the magnitude of the interaction effect. This is shown in the next figure.



However, this tactic does not allow us to eliminate the gap entirely. If we did so, we too would be ignoring the interaction between variation and the parallel structure. Instead, our robust project design tactic lets us reduce the magnitude of the interaction, which we model with a smaller but finite gap. The smaller gap (shown in the next figure) provides a correction factor that at this time we can only estimate.



How should we estimate the magnitude of the correction factor? At this writing, the most practical way to estimate it is simply by calculating a component tolerance for the parallel segments that are involved directly in the diamond-shaped structure. This gives us smaller component tolerances, which in turn create a smaller gap. But we can do this only in cases where we can uncouple the earlier segment of the longer component sequence.

[So that others also might subscribe to The Project Management Soap Box, please share the following link with friends and colleagues. Subscribe!