ERAM Mitigation Schemes

There are a lot of center/enroute air traffic controllers that are interested in what’s happening with ERAM, including those at facilities that have already run ERAM on live traffic.

They’ve heard some vague stories of problems with the system, but very few details.

Salt Lake Center (ZLC) ran an eight-day test that ended last week, and when it concluded we heard…nothing…

My facility, Minneapolis Center (ZMP), leapfrogged Seattle Center (ZSE) as an ERAM key site, becoming an “alternate key site” for a variety of reasons.  One of the reasons was the upcoming Winter Olympics in Vancouver, Canada, which start February 12, 2010 and are going to impact ZSE’s traffic.

On the off chance you’re still buying the FAA’s line that ERAM is “on time and on budget”, one of the reasons ZMP became an alternate key site was because ERAM isn’t up and running at ZSE like it was supposed to be.  And since the FAA is really motivated by avoiding bad headlines, they decided that an ERAM debacle during the international travel to and from the 2010 Winter Olympics might not be such a great idea.

But they are willing to test the ERAM system on live traffic elsewhere, including at Salt Lake Center, and Minneapolis Center.

Just for the record, they’re now apparently calling the use of ERAM on live traffic “runs” instead of “tests”, mostly because it sounds bad that the FAA is beta testing software on the American flying public.

ZMP went Initial Operational Capability (IOC) last Friday, but since there were no major incidents during that “run”, details about problems were tough to come by.

I think there are several reasons for that information void.

First, the controllers who are helping Lockheed Martin test ERAM (the cadre) have gotten so used to bugs they’ve become desensitized to them.  Unless a bug is so big it causes an instant crash (and there are some of those), it’s barely notable; just another glitch.

Second, there aren’t really that many controllers who have been exposed to ERAM running on live traffic.  Although we’ve all been trained, that was done in the simulation lab, and a lot of the problems they’re now having didn’t occur in the lab.

Finally, a lot of controllers tend to be pretty myopic about things; if something doesn’t affect them directly they don’t think about it much.  Even though there are lots of ERAM bugs, the controllers that have worked ERAM on live traffic have done so in a very controlled and sterile environment and often with very little traffic so most haven’t experienced the problems firsthand.

The FAA isn’t going out of its way to publicize problems with ERAM either.  Although I think there are managers secretly panicking about the progress of ERAM (or should be anyway), I don’t think that’s going to stop them from pushing ahead with the project.  Outwardly the FAA’s attitude is that ERAM development is moving along nicely, albeit with some minor problems.

Prior to ZMP’s ERAM IOC “runs”, they’ve been briefing the controllers about what to expect, in addition to telling them to leave ERAM alone as much as possible and not to perform any functions they don’t have to.

During the second ZMP IOC “run”, which occurred Monday night this week, controllers got two pages of paper.  One was titled in part “ZLC ERAM Mitigation Strategies”, and the second was “ZMP ERAM PR’s Not Resolved” (PR is a Problem Report).

Not surprisingly, the “ZLC ERAM Mitigation Strategies” was a list of workarounds for known problems.  The FAA has always loved workarounds, as they’re the easiest, quickest solution to a problem: simply dump the problem on the controller and make him work around it; don’t bother to actually fix the problem itself.

ERAM is likely going to have a bunch of workarounds for some time to come.  There are way too many bugs to avoid that, considering that the FAA clearly isn’t going to let bugs get in the way of them deploying ERAM anyway.

Here are a few of the items on the ZLC Mitigation List with the workarounds (I have highlighted what I consider the most significant problems):

  1. An interim altitude prevents an auto handoff to a TRACON facility – manual handoff required
  2. Mode C altitude may not replace a reported altitude – manually enter altitude
  3. PVD data blocks sent to adjacent facilities and that are redirected to another sector will not display – manually coordinate
  4. Intra-facility (in-house) interim altitudes can be changed without track control (after handoff) without override (/OK)
  5. Interim altitudes cannot be entered on your own data blocks – make override entry (/OK) to get data block control, or hand off the data block to adjacent sector, then have them hand back to you to get data block control
  6. Under certain circumstances handoffs to adjacent facilities will not hand off to the proper sector – manually direct handoff to proper sector
  7. Data blocks may be missing
  8. Failures of display when invoking preference settings when the display has empty constant range readout (CRR) groups
  9. Display failures when modifying existing annotations made using draw function – create new drawings instead of modifying existing one
  10. Adjacent sector boundaries are not displayed

The ZMP ERAM Problem Report (PR) list includes the following notables:

  1. PAR (Preferential Arrival Routes) don’t apply properly
  2. Data blocks way behind (70-170 miles) the route of flight
  3. Random Unsuccessful Transmission Messages (UTMs) to previous facility or facilities near flight plan route
  4. Bad Route from template or trial plan
  5. UTMs on flight plans transitioning from ZMP to Chicago Center (ZAU) and then back to ZMP, ZAU unable to hand off data block
  6. Auto-delete drops data block and remove strip (RS) on flight plan while flight still in sector
  7. Incorrect route processing
  8. Flight plan route off by 30 miles laterally resulting in dual track control between ZMP and Kansas City Center (ZKC)
  9. Data block auto-delete when trying to take track control
  10. Inconsistent track control assignment
  11. Delay in remove strip (RS) processing – can take 39 minutes
  12. General Information (GI) message reprints 10 days later

(That’s not even mentioning the memory leaks that slow down parts of the system so that the machines need to be rebooted daily, and the other underlying system problems that they won’t tell controllers about.)

Some of those are major problems that should preclude ERAM being used on live traffic at its current stage of development (check out #6 on the second list!).

The list of problems and the fact that they’re telling controllers not to use some functions aren’t exactly giving controllers lots of confidence in the tool they are supposed to use to keep airplanes apart, either.  At this point it’s possible that any combination of computer entries still has the potential to crash ERAM, and all the controllers who’ve used it on real airplanes know it, so they’re tiptoeing around it.

It’s still a ticking time bomb.

ERAM is undoubtedly a very complicated project that’s going to take some time to fully develop and debug.

But doing that debugging on live traffic is foolhardy and irresponsible.

The FAA’s own website contains the following statements:

Safety: The Foundation of Everything We Do

and the FAA’s alleged values:

Safety is our passion. We are the world leaders in aerospace safety.

Quality is our trademark. We serve our country, our stakeholders, our customers, and each other.

Integrity is our character. We do the right thing, even when no one is looking.

People are our strength. We treat people as we want to be treated.

Note to the FAA:  there’s a big difference between saying something (or writing it down), and actually doing it.

One comment

  1. It’s like Fatty Patty said when the OMA airspace got re-configured and there were no Dysim problems to help acclimate Areas 5 and 6 to the changes… “…they’re controllers…they’ll figure it out”…
