I’ve Been ERAMmed

In case people are visiting to see more updates on En Route Automation Modernization (ERAM), I’ll try to keep this short.

ERAM, the replacement computer/display system for enroute center controllers, is still experiencing problems and the FAA continues to use it on live air traffic. In other words, not much has changed.

The most critical ERAM bugs that I’m aware of (outside of outright display failures/lockups – big red “X”s) involve data block/tracking issues, some of which haven’t been corrected even though they’ve been known about for some time. The absolute worst tracking bug is when a data block drops off a target and the accompanying flight plan is simultaneously deleted. Other critical tracking bugs involve data blocks tracking on the wrong target.

One thing that has changed since I last wrote about ERAM is that I’ve since had the “opportunity” to work with it on live traffic. Needless to say the experience didn’t change my opinion about the program.

We were told in advance to not “experiment” with ERAM while using it (presumably because of the myriad problems we might encounter). Apparently it’s true that we’re really not testing the software any more right now; we’re just using it, while trying to not use it so much that we break it. (Maybe at this juncture the FAA figures there are enough known bugs already and they don’t want to find more.)

Many of those involved with the ERAM project continue to believe, and/or give the outward appearance, that things are going fine in spite of the persistent problems. In fact, controllers are hearing less and less about the bugs in the software as time goes by, which has the added effect of making it appear that things are going better than they really are.

I guess it’s the old, “No news is good news” theory…

During the Initial Operational Capability (IOC) or first run on live traffic at Minneapolis Center (ZMP) they provided controllers with a list of known major bugs in ERAM, but they’re now no longer providing that information before live runs. (Perhaps the list is too long…) Either way it’s obvious that the FAA doesn’t think controllers need to know about the deficiencies of the system they’re supposed to use to keep airplanes separated (or doesn’t want to advertise them anyway).

ZMP had two operational runs with ERAM the first week of March and not surprisingly we experienced some repeats of some of the most significant ERAM bugs.

Mind you, most of the bugs we experienced weren’t new bugs; they were existing bugs they already knew about.

Instead of fixing the known bugs before running ERAM on more live traffic, the FAA continues to run software it knows is faulty, and to run it more often.

“Idealists” like me would like to believe that the FAA and Lockheed Martin would want to fix those known bugs once they were discovered before running the same versions on more live traffic.

But that’s not how the ERAM program is progressing at all. That’s because the FAA and Lockheed Martin have agendas that don’t prioritize the safety of the flying public.

First, fixing all the bugs before running it more would take more time and cause the program to fall further behind schedule.

Another reason the FAA is now running software builds with known bugs on live traffic at the various key sites is to avoid refresher training for controllers as per the Memorandum of Understanding (MOU) between the FAA and the air traffic controllers’ union, NATCA.

The MOU says that:

“Basic and supplemental ERAM training shall be provided to all BUEs prior to implementation of ERAM at each facility. If more than forty-five (45) days elapses between the time BUEs complete ERAM training and the actual implementation of ERAM at the facility, ERAM refresher training shall be provided prior to implementation.”

Apparently IOC equates to “implementation” in the MOU (although I don’t see that definition in the MOU).

So there you have it: the FAA is now rushing ERAM into use at least in part so that it doesn’t have to re-train its controller workforce per the MOU.

Now that the Winter Olympics are over, Seattle Center (ZSE) is back in the ERAM game as a key site. ZSE ran an extended live run less than a week ago that predictably also had its share of problems, simply because they were running the same software build everyone knew had existing bugs.

In spite of all the problems, the FAA continues to press towards an In Service Decision (ISD), after which the non-key sites (the rest of the enroute centers) can move towards their own IOCs, even though the longest period any of the key sites has run ERAM is 8 days (and that under very controlled conditions).

So we’ve established that the FAA isn’t fixing significant/critical known bugs before running the ERAM software more often. They’re telling controllers to tip-toe around ERAM while using it to create the illusion that things are going well, in spite of the many known bugs. They’re withholding more and more information as time goes by and the project continues to go badly.

It’s obvious that the FAA is sticking with their approach that prioritizes the deployment schedule (which for the record continues to slip further and further behind), saving money on training, and pretty much anything else, over safety.

It’s just business as usual for the FAA…

5 comments

George says:

March 14, 2010 at 10:48 am

It really is a shame that a program that cost so much is a hindrance instead of a benefit to the controllers. I have spoken to controller friends at our site and told them to bang the heck out of ERAM like we did with Host. If it’s going to fail, better now than when you’re fully operational. My understanding from conversations with folks in DC and the Tech Center is that once ERAM goes ORD we basically have bought it. Lockheed then charges us to fix problems we identify after ORD. Let’s uncover as many as we can now, otherwise there won’t be enough money in the Ops budget to fix them after ORD.

zack says:

March 15, 2010 at 12:40 pm

You need to test the hell out of it, stress it and document everything! Copy all your findings to the union, do NASA forms, ATSAT, whatever but don’t let LM off the hook, they need to fix it BEFORE you get stuck with it forever.

Jimmy says:

March 20, 2010 at 8:07 am

I hear ZLC had to fall back to host last night. Any ideas why?

The ATC Freq says:

March 21, 2010 at 3:31 pm

Apparently ZLC was running one of the “T” version software builds (T9?), and had FDM (flight data management i.e. flight plan processing system) problems again.

Not surprisingly, the FDM issues were a large part of the reason ZLC fell back to HOST about a month ago as I wrote about here. At that time they were also running a “T” version build (T7 I believe).

It’s still a big mystery about why the FAA and Lockheed Martin insist on running versions they know have major bugs and then apparently being surprised when those bugs reappear…

George says:

March 21, 2010 at 7:56 pm

It’s not surprising that FDM caused ZLC to fall back. Under Host it’s a complex piece of code, and instead of migrating the software to the new platform LMCO decided to rewrite it. The current Host version represents fixes that have happened over the last 20 years. When you sign up to rewrite a piece of software of that complexity, expect years of debugging even with no functional improvements to it.

ATC Freq, you are right to be mystified about how a build which you know has problems will somehow self-heal at ZLC when we saw the same issues during testing at the Tech Center.

The only logical answer can be that management hoped the controllers would accept it this time. Plus, they are on the hook to deliver a new build by a certain date, so you get T7, and I bet on time; the problem is it didn’t work.
