Complacency: Laziness, or Learned?

Posted by The ATC Freq on March 5, 2010 at 9:57 pm. One comment

After a recent incident gained media attention, there were accusations that the FAA and its air traffic controllers had grown complacent with regard to safety.

The latest incident involved a veteran controller at New York’s JFK airport, who had his children relay some air traffic clearances on the radio frequency.

The JFK incident was the third in a string of recent air traffic control related incidents that made the headlines.  The others were last summer’s mid-air collision of two VFR aircraft near Teterboro airport in New Jersey, and the incident last fall where Northwest 188 lost contact with air traffic control and eventually overflew its destination.

Always ready to put on the proper face to the media, the top levels of FAA management reacted to the latest incident with shock and outrage:

“This lapse in judgment not only violated FAA’s own policies but common sense standards for professional conduct. These kinds of distractions are totally unacceptable,” administrator Randy Babbitt said in the statement.

“…(violations) of FAA’s own policies…”?  “…standards for professional conduct…”?  Really, Mr. Babbitt?!

Let’s examine some facts about the three incidents:

In the case of the mid-air collision, the supervisor was on the clock but out of the facility running personal errands, which were apparently more important than his job.  In the case of Northwest 188, several managers decided to simply ignore orders.  And in the latest case, one or more supervisors apparently allowed an employee, on two different days, to let his children talk to airplanes on the radios.

In each and every incident, there was an FAA manager involved who wasn’t following the rules.  Do you think that’s just a coincidence?

The managers are the people supposed to be ensuring that the workers are following the rules, and what are they doing?  They’re breaking the rules themselves!

Does anyone remember this video?

The conduct of some of those managers is a violation of FAA Human Resources Policy which states in part (my emphasis):

An employee’s conduct on the job has a direct bearing on the proper and effective accomplishment of official duties and responsibilities. Employees are expected to approach their duties in a professional and business like manner and maintain such an attitude throughout the workday. It is also expected that employees will maintain a professional decorum at all times while in a temporary duty travel status or otherwise away from their regularly assigned post of duty, such as telecommuting, whether at home or at a telecommuting site, or attending training.

So much for following the rules and the higher standard FAA managers are allegedly held to.

Do you think a controller or two might have noticed managers in those cases intentionally violating FAA policies and acting unprofessionally?  Do you think they didn’t notice that nothing happened to any of those managers for doing so?

Isn’t this called, “setting an example”?

Last year I wrote about the FAA’s “dumb luck” approach to safety, including the “customer service initiative” that ultimately led to two FAA safety inspectors turning into whistle-blowers when FAA managers ignored their concerns about problems with Southwest Airlines’ maintenance.

It was clear then that the FAA was only concerned about safety when the problems hit the headlines.

Then the FAA decided to reclassify air traffic control errors, turning many errors into non-events (and making it appear to the flying public we were having fewer errors).

They created a safety program (ATSAP) that allows controllers to anonymously report errors without fear of punishment, but which in turn also masks many systemic problems and allows the FAA to ignore them.

Currently the FAA is testing its ERAM software, even with its many known bugs, on live air traffic.

And almost every time something makes it into the news that involves the FAA, an FAA spokesman quickly says, “Safety was never compromised.”

The FAA claims it’s an organization that’s passionate about safety, but there’s little to indicate it’s actually doing much to improve safety at all.  If anything it’s degrading safety more often than not.  It says one thing but does another.

So between managers not following FAA rules, and the many changes to FAA policies and procedures regarding air traffic safety and error reporting, should it really be a surprise that controllers may have gotten complacent?

And if they are complacent, aren’t they really just following direction and examples from the FAA management team?

More Significant ERAM Problems

Posted by The ATC Freq on February 24, 2010 at 3:39 pm. 15 comments

Salt Lake Center (ZLC) reverted to the HOST computer system last night due to major problems after starting an ERAM run last week that was supposed to be permanent.

I’m sure the FAA and the contractor Lockheed Martin will write it off as just another “glitch” (i.e. part of the development cycle), but it’s another glaring demonstration of how unreliable the ERAM software still is, even though the FAA continues to test it on live traffic, expecting air traffic controllers to simply work around its many problems and keep aircraft safely separated nonetheless.

ZLC started running ERAM on what was supposed to be a permanent basis on the morning of Wednesday, February 17.

They had previously completed an eight-day test that ended the first week of February, followed by a two-week delay in which Lockheed Martin was supposed to correct the (known) bugs in the software before ZLC began using the new version permanently.

The latest failure shows that, in spite of the software updates, ERAM obviously still has a long way to go before it’s fit to use on live traffic 24/7.

Notably the event marking the first enroute center to transition to ERAM full time came and went quietly.  Instead of calling in the media and having a press release (and having sheet cake), the FAA barely noted the occasion.

The complete lack of fanfare noting the first enroute center to start running ERAM full time shows that the FAA knows full well how unreliable/unstable the ERAM software still is.  At this point it’s clear they’re making deliberate efforts to not call any attention to the ERAM project.

After lots of boastful press from the FAA over ERAM early last year, including statements of how the program was on budget and ahead of schedule (even though it wasn’t), the FAA abruptly stopped talking about ERAM after significant problems running it at ZLC in a test last fall.

The FAA apparently learned its lesson then and now isn’t going to mention ERAM at all, instead choosing to continue testing and deploying ERAM quietly and keeping its fingers crossed that it won’t cause a news event.

Every time the FAA and Lockheed Martin complete another test without significant problems they seem to convince themselves the project is doing just fine.  After the eight day ZLC test they were convinced the software was ready for permanent use after just a little “tweaking”, even though it’s now clear that was far from the truth.

Last fall, one of the problems that resulted in the aborted ZLC test was that datablocks (the tags that display the aircraft call sign and altitude as well as other information) wouldn’t track properly and sometimes ended up tagging up on the wrong target.

Guess what?  That problem still exists many months (and many updates) later.

The datablock/tracking functionality is fundamental to an air traffic display system and is therefore safety-critical.  It’s disturbing that at this stage this basic functionality is still so unreliable in ERAM.

This may not be simply due to software bugs either; there may be some significant problems with the software tracking algorithms within ERAM, which from what I’ve heard are radically different from those used in the HOST computer system.

Here’s a list of some of the latest bigger problems with ERAM (and note that some of them, especially the tracking problems, aren’t new):

Interim Flight Plans – If a controller starts an interim flight plan (datablock only, no beacon code or routing) ERAM aggressively searches for the first target of opportunity to track. It may be a primary, or a beacon belonging to another aircraft.
Track Un-Pairing – Arbitrarily the datablock will disassociate from the beacon target. We are unable to determine what seems to cause it. We looked at RADAR sort boxes and ASR terminal RADAR feeds, and who knows what else. ERAM will not automatically re-pair the datablock and the target like HOST does. We see this happen frequently around SLC where limited datablocks create a bright large yellow spot over the airport. You can’t shut them off and it is easy for the un-paired datablock to disappear into the blob.
Track Swap – We had some instances of departures where ERAM switched datablocks on aircraft on completely different routes and entering different sectors.
Bogus Beacon Codes – Frequently ERAM will flash in the third line a bogus beacon code (like the aircraft is squawking an incorrect code) for one sweep and then it disappears.
Track Pairing – If ERAM associates a full datablock with an incorrect beacon, you have to track the datablock at least 32 miles away from the incorrect beacon for ERAM to accept the disassociation. Approximately 30 seconds has to pass before you can pair it with the correct beacon target.
Bogus Alerts – We see significant numbers of bogus alerts: MSAW, conflict probe in EDST (URET replacement), aircraft working in SUAs.
Inter Facility Handoffs to Vertically Stratified Sectors – If an aircraft changes altitude 30 minutes prior to exiting the facility, and the new altitude causes the aircraft to enter a different sector in the receiving facility, ERAM will hand the aircraft to the incorrect sector if you use the auto addressed handoff option (single alpha character followed by CID). You have to manually address the handoff to the correct sector.
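To put the Track Pairing item in concrete terms, here is a rough Python sketch of the rule as it was reported above.  It is not ERAM code and says nothing about how ERAM is actually implemented; the two thresholds (32 miles, roughly 30 seconds) come from the report, while the class and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Thresholds as stated in the controller report above.
# Everything else in this sketch is a hypothetical illustration.
MIN_SEPARATION_MILES = 32.0   # distance needed before the disassociation is accepted
REPAIR_DELAY_SECONDS = 30.0   # the report says "approximately 30 seconds"


@dataclass
class RepairAttempt:
    """One attempt to move a datablock off an incorrect beacon (hypothetical model)."""
    miles_from_wrong_beacon: float
    seconds_since_disassociation: float


def disassociation_accepted(attempt: RepairAttempt) -> bool:
    """The datablock must be dragged at least 32 miles from the wrong beacon."""
    return attempt.miles_from_wrong_beacon >= MIN_SEPARATION_MILES


def can_pair_with_correct_beacon(attempt: RepairAttempt) -> bool:
    """Re-pairing is only possible after the disassociation and the waiting period."""
    return (
        disassociation_accepted(attempt)
        and attempt.seconds_since_disassociation >= REPAIR_DELAY_SECONDS
    )
```

Under these assumptions, a controller who drags the datablock only 10 miles away never gets the disassociation accepted no matter how long he waits, and one who drags it 40 miles still can’t re-pair it until about half a minute has passed.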

Apparently the latest software version yet to be put into use isn’t intended to fix many of the aforementioned problems either; instead it addresses other bugs.

It will be interesting to see how the latest episode affects the entire ERAM project.

One way or the other it’s going to result in the project falling further behind schedule.

But I doubt very much that it will convince the FAA to stop testing the software on live traffic.

Dodging Responsibility for ERAM

Posted by The ATC Freq on February 17, 2010 at 5:59 pm. One comment

FAA management doesn’t really like to make decisions.  That’s primarily because they know if they make decisions they can potentially be held accountable for those decisions.

So they’ve decided that the safest tack is to avoid making decisions whenever possible.

If those same managers started as air traffic controllers, part of the reason many of them quit working airplanes and took management positions so quickly after completing training is that they didn’t like making lots of real-time decisions as controllers, decisions they knew they would be held accountable for later.  (That and most of them weren’t very good at making decisions as air traffic controllers.)

As an example, a few years ago I vividly remember witnessing a conversation a since-retired controller had with his immediate supervisor (front line manager or FLM), as it was a classic exchange between an FAA manager and an employee looking for policy guidance about how to do his job.

At the time there were four or more different position relief briefing checklists available to controllers, all of them slightly different.

The position relief checklist is used when one controller takes over a position from another controller (used during a position relief briefing), and is intended to make sure all the necessary information is passed on from one controller to the next and nothing is omitted.  The position relief briefing is a safety-critical process.

Since there were so many different position relief checklists, the controller asked his supervisor what should have been a simple question:  which checklist was he supposed to be using?

The manager’s answer to his question? “This is the one I use.”

Since that wasn’t a real answer to his question, the controller rephrased, asking, “So that is the checklist I am supposed to use?”

The response from the manager was the same carefully worded, intentionally evasive, “This is the one I use.”

Since he wasn’t getting a definitive answer the controller asked the same question repeatedly and the manager repeated the same response several times.  When it was obvious he wasn’t going to get a real answer to his question, the controller gave up.

In the case of a question about a procedure controllers use many times throughout the course of their work day, a controller couldn’t get definitive guidance from his own supervisor.

This is the kind of work direction controllers routinely get from FAA management.  Controllers are often left to figure problems out on their own because FAA management refuses to give real guidance on procedures and policies.

The management culture in the FAA has proven time and time again that they’re all about dodging responsibility and are actually averse to making decisions.

So it came as no surprise last fall that, once the FAA realized how badly the ERAM project was going, it decided it was time to bring the controllers’ union (NATCA) on board with the project.  It was time for FAA management to start transferring responsibility for ERAM to the controller workforce and its union.

Before that, the FAA had been going at the project solo, mostly because they realized that the union would likely object to too many parts of the project (primarily the fact that ERAM was so bug-ridden it wasn’t fit for use) and addressing the many problems would result in it taking longer to deploy.  The FAA is/was more concerned about delays with the ERAM program than with whether or not it actually works.

But at the beginning of December the FAA was savvy enough to sign a Memorandum of Understanding (MOU) with the union on the ERAM Implementation and Deployment.

The subject of the MOU came up recently because I was talking to one of our ERAM cadre (one of the controllers helping test ERAM), who was under the impression that the union ERAM cadre members had the authority to halt the ERAM project for safety concerns.  The MOU, however, does not give the union that authority.

Unfortunately for the union and its membership, it’s not a very good MOU, and the way it’s written the union has instead now positioned itself as a patsy or fall-guy for the shortcomings of the project.

Granted, getting “invited” to be involved with the ERAM project at the eleventh hour was pretty much a no-win situation for the union and its members.

If the union stood by and didn’t get involved with ERAM, the FAA would portray it as being part of the problem.  It wouldn’t be in the best interest of the union to be seen as obstructionist with respect to new technology that’s part of NextGen, and anything that helps controllers do their jobs better is ultimately in the best interest of the bargaining unit.  Additionally controllers need to be involved in developing those tools too so that they’re useful, in spite of the fact that it’s clear that the FAA isn’t too concerned about that.

But the MOU is weak because it doesn’t give the union any real authority with respect to the ERAM project.

It’s filled with clauses like “…shall be provided the opportunity to comment…” and “…shall be allowed to collaborate…”, “…provided the opportunity to participate…”, “…to make recommendations…”, “…prioritize problems…”, “…to attend briefings…” and “…to participate…”.

Nowhere in the MOU does it say that the union has the authority to stop the ERAM project to force the FAA and/or the developer (Lockheed Martin) to correct significant safety problems with it.

Instead it says:

“If the Union reasonably believes that ERAM is having an adverse impact on the National Airspace System, it shall expeditiously provide its concerns and basis for them to the Agency.  The Agency shall expeditiously evaluate the Union’s concerns and convey any proposed actions regarding their disposition to the Union.”

So in other words, the union can submit its concerns about the project to the FAA.  But the FAA is in no way obliged to actually do something about the concerns.

The fact that the FAA has repeatedly demonstrated its willingness to test the ERAM software on the flying public even with its significant known bugs, coupled with its normal lackadaisical approach towards safety related issues, makes it doubtful that any concerns the union has with ERAM would halt or delay the program now.

If the FAA were truly interested in collaborating with the union and its membership it would have brought them in on the ERAM project years earlier.  Instead, it waited until the last possible moment to sign an MOU, and only then as a damage control measure.

If ERAM ultimately fails or results in a media event, the union will now be a partner in that failure.  If it’s a success, the FAA will take credit for the entire project.

Although I normally don’t give the FAA any credit for being that savvy, in the case of the ERAM MOU, the FAA positioned itself perfectly to avoid responsibility for any failures of the project.  It doesn’t come as any surprise though, because after all, that’s what FAA management is really good at.