This article reviews the common causes of column malfunctions. It then looks at the basic troubleshooting tools: the systematic strategy for troubleshooting distillation problems and the dos and don'ts for formulating and testing theories.

Distillation operation and troubleshooting follow an axiom analogous to "80% of the business is brought in by 20% of the customers": a small number of factors cause the vast majority of column malfunctions. A person engaged in operating or troubleshooting distillation columns must develop a good feel for, and understanding of, these factors. For these factors, this person must be able to distinguish good from poor practices and correctly evaluate the ill effects of poor practices and their relevance to the assignment at hand. While a good knowledge and understanding of the broader field of distillation will be beneficial, the troubleshooter can often get by with a shallow knowledge of this broader field. It is well accepted that troubleshooting is a primary job function of operating engineers and supervisors. Far too few realize that distillation troubleshooting starts at the design phase. Any designer wishing to achieve a trouble-free column design must be as familiar with troubleshooting and operation as the person running the column. Prior to embarking on this journey, it is necessary to define the problem areas and examine the tools available for uncovering malfunctions.

Causes of Column Malfunctions

Historical data of malfunctioning columns were extracted from the literature and suggest the following:

  • Instrument and control problems, startup and/or shutdown difficulties, and malfunctioning column internals are the major single causes of column malfunctions. Together, they make up more than half of the reported incidents. Familiarity with these problems, therefore, constitutes the "bread and butter" of persons involved in troubleshooting and operating distillation and absorption columns. For this reason, these topics receive primary emphasis in the industry.
  • Reboilers, condensers, and operation difficulties amount to about half of the remaining problems. Thus three out of four incidents are caused by either these or the factors previously mentioned. Familiarity with these problems, therefore, is of great importance to persons involved in distillation and absorption operation and troubleshooting.
  • Primary design problems, foaming, installation mishaps, relief problems, and tray and downcomer layout problems make up the rest of the column malfunctions. Familiarity with these problems is useful to troubleshooters and operation personnel, but only one incident out of every four is likely to be caused by one of these factors.

Primary design is an extremely wide topic, encompassing vapor-liquid equilibrium, reflux-stages relationship, stage-to-stage calculations, unique features of multicomponent distillation, tray and packing efficiencies, scale-up, column diameter determination, flow patterns, type of tray, and size and material of packing. This topic occupies the bulk of most distillation texts, and perhaps represents the bulk of our present distillation know-how.

While this topic is of prime importance for designing and optimizing distillation columns, it plays only a minor role when it comes to distillation operation and troubleshooting. The above statements must not be interpreted to suggest that operation personnel and troubleshooters need not be familiar with the primary design. Quite the contrary. A good troubleshooter must have a solid understanding of primary design because it provides the foundation of our distillation know-how. However, the above statements do suggest that in general, when a troubleshooter examines the primary design for the cause of a column malfunction, he or she has less than one chance out of ten of finding it there.
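The proportions quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes "more than half" as exactly one half and "about half of the remaining" as exactly one half of the remainder, which are simplifying assumptions and not the underlying survey data. The function name `cause_shares` is hypothetical.

```python
from fractions import Fraction


def cause_shares(major=Fraction(1, 2)):
    """Split reported incidents into the three groups described in the text.

    major:     instruments/controls, startup/shutdown, column internals
    secondary: reboilers, condensers, operation difficulties
    rest:      primary design, foaming, installation, relief, tray layout
    """
    secondary = (1 - major) * Fraction(1, 2)  # "about half of the remaining"
    rest = 1 - major - secondary
    return major, secondary, rest


major, secondary, rest = cause_shares()
print("first two groups:", major + secondary)  # 3/4, i.e. three out of four
print("remaining group: ", rest)               # 1/4, i.e. one out of four
```

Because the first group is in fact somewhat more than half, the remaining group is, if anything, slightly smaller than one quarter under these assumptions.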

Column Troubleshooting - A Case History

In section 3 below, the systematic approach recommended for tackling distillation problems is mapped out. The recommended sequence of steps is illustrated with reference to the case history described below.

The following story is not a myth; it really happened. One morning as I sat quietly at my desk in corporate headquarters, the boss dropped by to see me. He had some unpleasant news. One of the company's refinery managers was planning to visit our office to discuss the quality of some of the new plants that had been built in his refinery. As an example of how not to design a unit, he had chosen a new gas plant for which I had done the process design. The refinery manager had but one complaint: "The gas plant would not operate."

I was immediately dispatched to the refinery to determine which aspect of my design was at fault. If nothing else, I should learn what I did wrong so as not to repeat the error.

Upon arriving at the refinery, I met with the operating supervisors. They informed me that, while the process design was fine, the gas plant's operation was unstable because of faulty instrumentation. However, the refinery's lead instrument engineer would soon have the problem resolved.

Later, I met with unit operating personnel. They were more specific. They observed that the pumparound circulating pump (see Fig. 1) was defective. Whenever they raised hot oil flow to the debutanizer reboiler, the gas plant would become destabilized. Reboiler heat-duty and reflux rates would become erratic. Most noticeably, the hot-oil circulating pump's discharge pressure would fluctuate wildly. They felt that a new pump requiring less net positive suction head was needed.

Both these contradictory reports left me cold. Anyway, the key to successful troubleshooting is personal observation. So I decided to make a field test.

When I arrived at the gas plant, both the absorber and debutanizer towers were running smoothly but not well. Figure 2 shows the configuration of the gas plant. The debutanizer reflux rate was so low it precluded significant fractionation. Also, the debutanizer pressure was 100 psi below design. Only a small amount of vapor, but no liquid, was being produced from the reflux drum. Since the purpose of the gas plant was to recover propane and butane as a liquid, the refinery manager's statement that the gas plant would not operate was accurate.

Figure 1 - Hot oil from the fractionator supplies heat to gas-plant reboilers

Figure 2 - Leaking debutanizer reboiler upsets gas plant

As a first step, I introduced myself to the chief operator and explained the purpose of my visit. Having received permission to run my test, I switched all instruments on the gas-plant control panel from automatic over to local/manual. In sequence, I then increased the lean oil flow to the absorber, the debutanizer reflux rate, and the hot-oil flow to the debutanizer reboiler.

The gas plant began to behave properly. The hot-oil circulating pump was putting out a steady flow and pressure. Still, the plant was only producing a vapor product from the debutanizer reflux drum. This was because the debutanizer operating pressure was too low to condense the C3-C4 product. By slowly closing the reflux drum vapor vent valve, I gradually increased the debutanizer pressure from 100 psig toward its design operating pressure of 200 psig.

Suddenly, at 130 psig the hot-oil flow to the debutanizer's reboiler began to waver. At 135 psig, the debutanizer pressure and the hot-oil flow plummeted. This made absolutely no sense. How could the debutanizer pressure influence hot-oil flow?

To regain control of the gas plant, I cut reflux to the debutanizer and lean-oil flow to the absorber. I was now back where I started. The thought of impending failure loomed. I repeated this sequence twice more. On each occasion, all went well until the debutanizer pressure was increased. By this time it was 3 a.m. Was it also time to give up and go home?

Just then, I noticed a commotion at the main fractionator control panel. The operators there stated that the fractionator was flooding again - for the third time that night. The naphtha production from the fractionator had just doubled for no apparent reason.

In every troubleshooting assignment there always occurs that special moment, the moment of insight. All of the bits and pieces fall into place, and the truth is revealed in its stark simplicity.

I cut the debutanizer pressure back to 100 psig and immediately the flooding in the main fractionator subsided. The operators then closed the inlet block valve to the hot-oil side of the reboiler and opened up a drain. Naphtha poured out instead of gas oil. This showed that the debutanizer reboiler had a tube leak.

Whenever the debutanizer pressure reached 130 psig, the reboiler pressure exceeded the hot-oil pressure. The relatively low-boiling naphtha then flowed into the hot oil and flashed. This generated a large volume of vapor that then backed hot oil out of the reboiler. The naphtha vapors passed on into the main fractionator and flooded this tower. Thus, the cause of the gas plant instability was neither a process design error, instrument malfunction, nor pumping deficiency. It was a quite ordinary reboiler tube failure.
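The mechanism comes down to a pressure comparison across the leaking tube. The sketch below is a minimal illustration, not a model of the actual plant: the hot-oil side pressure is not stated in the story, so `HOT_OIL_PSIG = 130` is an assumption based only on the pressure at which the upset began, and the function name `leak_direction` is hypothetical.

```python
def leak_direction(process_psig, hot_oil_psig):
    """Return which way fluid crosses a leaking reboiler tube.

    Flow through a leak goes from the higher-pressure side to the
    lower-pressure side; this ignores static head and pressure drop.
    """
    if process_psig > hot_oil_psig:
        return "process -> hot oil"
    if process_psig < hot_oil_psig:
        return "hot oil -> process"
    return "no net flow"


HOT_OIL_PSIG = 130  # assumed value, inferred from where the instability started

for debutanizer_psig in (100, 120, 135, 200):
    print(debutanizer_psig, "psig:", leak_direction(debutanizer_psig, HOT_OIL_PSIG))
```

Under this assumption, any debutanizer pressure above about 130 psig pushes naphtha into the hot-oil circuit, which is consistent with the observation that the column could never be brought up to its 200 psig design pressure.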

3. Strategy for Troubleshooting Distillation Problems

In almost any troubleshooting assignment, it is desirable to solve a problem as rapidly as possible with the least amount of expenditure. In a surprisingly large number of cases, this objective is only partially achieved. One of the major obstacles to achieving this objective is a poor (often nonexistent) strategy for tackling the problem.

When devising a troubleshooting strategy, it is useful to think in terms of a "doctor and patient" analogy. The doctor's troubleshooting strategy in treating a patient is well-established and easily understood by most people. Applying similar principles to solving distillation problems can often map out the most effective and least expensive course of action.

The sequence of steps below is often considered optimum for tackling a troubleshooting problem. A good troubleshooting strategy always proceeds stepwise, starting with the simple and obvious.

  1. Assess the safety or environmental hazard that the problem can create. If a hazard exists, an emergency action is required prior to any troubleshooting efforts. In terms of the medical analogy, measures to save the patient or prevent the patient's problem from affecting others have priority over investigating the cause of the problem.
  2. Implement a temporary strategy for living with the problem. Problem identification, troubleshooting, and correction take time. Meanwhile, adverse effects on safety, the environment, and plant profitability must be minimized. The strategy also needs to be as conducive as practicable for troubleshooting. The strategy, and the adverse effects that are to be temporarily tolerated (e.g., instability, lost production, off-spec product), usually set the pace of the troubleshooting investigation.

In the debutanizer case history, the short-term strategy was to run the column at a pressure low enough to eliminate instability and to tolerate an off-spec bottom product. In the medical analogy, the short-term strategy is hospitalization, or going to bed, or just "taking it easy." This strategy usually sets the urgency of treatment.

  3. Obtain a clear, factual definition of the symptoms. A poor definition of symptoms is one of the most common troubleshooting pitfalls. In the debutanizer case history above, the following definitions were used by different people to describe the symptoms of a reboiler tube leak problem:
  • "The gas plant would not operate."
  • "The gas plant's operation is unstable because of faulty instrumentation. However, the problem will soon be resolved by the instrument engineer."
  • "The oil circulating pump is defective. Whenever the oil flow to the reboiler is raised, reboiler heat duty and reflux rate would become erratic, and the pump's discharge pressure would fluctuate wildly. A new pump requiring less net positive suction head is needed."
  • "The column was running smoothly but not well. Reflux rate was too low, so it precluded significant fractionation. The column pressure was 100 psi below design. Only a small amount of vapor, but no liquid, was being produced from the reflux drum, which should have produced mainly liquid. Other problems noticed by plant personnel are as described above."

The above represents a typical spectrum of problem definitions. The last definition, supplied by a troubleshooting specialist, can clearly be distinguished. The first two definitions were nonspecific and insufficiently detailed. The third described part of the story, but left out a major portion. The first three definitions also contained implied diagnoses of the problem, none of which turned out to be correct.

  4. Examine the column behavior yourself. This is imperative if the problem definition is poor. In the debutanizer example above, the troubleshooter would have been oblivious to a major portion of the problem definition had he based his investigation entirely on other people's observations. Some communication gap always exists between people, and it is often hard to bridge.

In some circumstances, it may be impractical or too expensive for the troubleshooter to visit the site (e.g., a column located on another continent). In this case, the troubleshooter must be in direct (i.e., phone) communication with the operating person, who should be entirely familiar with the column, its operation, and its history. The problem definition in this case must be particularly sharp.

  5. Learn about the column history. The question, "what are we doing wrong now that we did right before?" is perhaps the most powerful troubleshooting tool available. If the column is new, closely examine any differences between the column and columns used for identical or at least similar services. In addition, examine any differences between the expected and the actual performance. Each difference can provide a major clue. In the debutanizer example above, the troubleshooter included a comparison to design performance in the problem definition (he was working with a new column).

Digging into the past may also reveal a recurring ("chronic") problem. If so, finding the correct link between the past and present circumstances can be very illuminating. Be cautious when identifying the link; a new problem may give the same symptoms as a past problem but be caused by an entirely different mechanism.

  6. Search and scan events that occurred when the problem started. Carefully review operating charts, trends, computer, and operator logs. Establish event timing in order to differentiate an initial problem from its consequences. Include events that may appear completely unrelated, as these may be linked in an obscure manner to the problem. In the debutanizer example, it was the observation that flooding in the fractionator coincided with the debutanizer becoming unstable that gave the troubleshooter the vital clue. At first glance, the two appeared completely unrelated.
  7. Listen to shift operators and supervisors. Experienced people can often spot problems, even if they cannot fully explain or define them. Listening to those people can often provide a vital clue. In the debutanizer example, some of the important observations were supplied by these people.
  8. Do not restrict the investigation to the column. Often, column problems are initiated in upstream equipment. Doctors frequently look for clues by asking patients about people they have been in contact with or their family health history.
  9. Study the behavior of the column by making small, inexpensive changes. These are particularly important for refining the definition of symptoms, and they may contain a vital clue. Record all observations and collect data; these may also contain a major clue, which can easily be hidden and become forgotten as the investigation continues. In the debutanizer example, the troubleshooter increased column pressure and watched its behavior. This led him to the observation that the debutanizer pressure affected oil flow - a major step in refining the problem definition.
  10. Take out a good set of readings on the column and its auxiliaries, including laboratory analyses. Misleading information supplied by instruments, samples, and analyses is a common cause of column malfunctions. Always mistrust or suspect instrument or laboratory readings, and make as many crosschecks as possible to confirm their validity. Instruments may malfunction even when the instrument technician can swear they are correct. In one example an incorrect pipe design caused an erroneous reading of a reflux flow meter. Survey the column piping for any unusual features such as poor piping arrangement, leaking valves, "sticking" control valves, and valves partially shut. Compile mass, component, and energy balances; these function as a check on the consistency of instrument readings and the possibility of leakage. This step is equivalent to laboratory tests taken by a doctor on the patient. Scan the column drawings carefully for any unusual features. Check the column internals against good design practices, and determine whether any have been violated. If so, examine the consequences of such violation and its consistency with the information. Carry out a hydraulic calculation at test conditions to determine if any operating limits are approached or exceeded. If a separation problem is involved, carry out a computer simulation of the column; check against test samples, temperature readings, and exchanger heat loads.
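The mass-balance crosscheck mentioned in the last step can be sketched in a few lines. This is a minimal illustration under stated assumptions: a single feed, steady state, all flows in the same mass units, and an arbitrary 5% closure tolerance. The function names and the flow values are hypothetical and are not taken from the case history.

```python
def balance_closure(feed, products):
    """Fractional mismatch between the feed and the sum of product flows."""
    if feed <= 0:
        raise ValueError("feed flow must be positive")
    return (feed - sum(products)) / feed


def readings_consistent(feed, products, tolerance=0.05):
    """True if the overall mass balance closes within the given tolerance."""
    return abs(balance_closure(feed, products)) <= tolerance


# Hypothetical meter readings (same mass-flow units throughout)
feed = 1000.0
print(readings_consistent(feed, [380.0, 600.0]))  # balance closes within 5%
print(readings_consistent(feed, [300.0, 600.0]))  # 10% unaccounted for
```

A balance that fails to close does not say which reading is wrong; it only signals that at least one meter is suspect or that material is leaving or entering somewhere unmeasured, such as through a leaking exchanger tube.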

Dos and Don'ts for Formulating and Testing Theories

Following the previous steps, a good problem definition should now be available. In some cases (e.g., the debutanizer), the cause may be identified. If not, there will be sufficient information to narrow down the possible causes and to form a theory. In general, when problems emerge, everyone will have a theory. In the next phase of the investigation, these theories are tested by experimentation or by trial and error. The following guidelines apply to this phase:

  1. Logic is wonderful as long as it is consistent with the facts and the information is good.
  2. When formulating a theory, attempt to visualize what is happening inside the column. One useful technique is to imagine yourself as a pocket of liquid or vapor traveling inside the column. Keep in mind that this pocket will always look for the easiest path. Another useful technique is to think of everyday analogies. The processes that occur inside the column are no different from those that occur in the kitchen, the bathroom, or in the yard. For instance, blowing air into a straw while sipping a drink will make the drink splash all over; similarly, a reboiler return nozzle submerged in liquid will cause excessive entrainment and premature flooding.
  3. Do not overlook the obvious. In most cases, the simpler the theory, the more likely it is to be correct.
  4. An obvious fault is not necessarily the cause of the problem. One of the most common troubleshooting pitfalls is discontinuing or retarding further troubleshooting efforts when an obvious fault is uncovered. Often, this fault fits in with most theories, and everyone is sure that the fault is the cause of the problem. The author is familiar with many situations where correcting an obvious fault neither solved the problem nor improved performance. Once an obvious fault is detected, it is best to regard it as another theory and treat it accordingly.
  5. Testing theories should begin with those that are easiest to prove or disprove, almost irrespective of how likely or unlikely these theories are. If it is planned to shut the column down, and shutting it down is expensive, it is often worthwhile to cater to a number of less drastic theories even if some are longer shots.
  6. Refrain from making any permanent changes until all practical tests are done.
  7. Look for possibilities of simplifying the system. For instance, if it is uncertain whether an undesirable component enters the column from outside or is generated inside the column, consider operating at total reflux to check it out.
  8. Do not overlook human factors. Other people's reasoning is likely to differ from yours, and they will act based on their reasoning. The more thoroughly you question their design or operating philosophy, the closer you will be able to reconstruct the sequence of events leading to the problem. In many cases, you may also discover major considerations you are not aware of.
  9. Ensure that management is apprised of what is being done and is receptive to it. Otherwise, some important nontechnical considerations may be overlooked. Further, management is far less likely to become frustrated with a slow-moving investigation when it is convinced that the best course of action is being followed.
  10. Involve the supervisors and operators in each "fix." Whenever possible, give them detailed guidelines of an attempted fix, and leave them with some freedom for making the system work. There have been many cases where actions of a motivated operator made a fix work, and other cases where a correct fix was unsuccessful because of an unmotivated effort by the operators.
  11. Beware of poor communication while implementing a "fix." Verbal instruction, rush, and multidiscipline personnel involvement generate an atmosphere ripe for communication problems. Ensure any instructions are concise and sufficiently detailed. If leaving a shift team to implement a fix by themselves, leave written instructions. Be reachable and encourage communication should problems arise. Call in at the beginning of the shift to check if the shift team understood your instructions.
  12. Recognize that modifications are hazardous. Many accidents have been caused by unforeseen side effects of even seemingly minor modifications. Ban "back of an envelope" modifications, as their side effects can be worse than the original problem. Properly document any planned modification, and have a team review it systematically with the aid of a checklist such as a HAZOP checklist. Before completion, inspect to ensure the modification was implemented as intended.
  13. Properly document any fix which is being adopted, the reasons for it, and the results. This information may be useful for future fixes.
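Guideline 5 above, testing the easiest theories first almost irrespective of likelihood, can be illustrated as a simple sort. The sketch below is only an illustration: the effort scores and likelihood guesses are invented for the example, and the names `Theory` and `rank_theories` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Theory:
    name: str
    test_effort: int    # assumed scale: 1 = quick field check, 5 = needs a shutdown
    likelihood: float   # rough guess between 0 and 1


def rank_theories(theories):
    """Order theories by ease of testing; likelihood only breaks ties."""
    return sorted(theories, key=lambda t: (t.test_effort, -t.likelihood))


# Invented scores, loosely based on the theories raised in the case history
candidates = [
    Theory("process design error", 4, 0.1),
    Theory("reboiler tube leak (open a drain)", 1, 0.3),
    Theory("defective circulating pump", 3, 0.3),
    Theory("faulty instrumentation", 2, 0.4),
]

for theory in rank_theories(candidates):
    print(theory.test_effort, theory.name)
```

Sorting on effort first means a cheap check is run even when the theory behind it is a long shot, which matches the advice to exhaust the less drastic tests before committing to an expensive shutdown.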

* Reproduced from Distillation Operation by Henry Kister