2:15-CV-05811 Response To MSJ


Case 2:15-cv-05811-CBM-SS Document 26 Filed 04/25/16 Page 1 of 23 Page ID #:167

CURRY, PEARSON & WOOTEN, PLC
Attorneys at Law
814 W. Roosevelt
Phoenix, Arizona 85007
(602) 258-1000 Fax (602) 523-9000

Michael W. Pearson, AZ SBN 016281
[email protected]
[email protected]
Admitted Pro Hac Vice
Attorneys for Plaintiff

IN THE UNITED STATES DISTRICT COURT
FOR THE CENTRAL DISTRICT OF CALIFORNIA
WESTERN DIVISION

Jorge Alejandro Rojas,
          Plaintiff,
vs.
Federal Aviation Administration,
          Defendant.

Case No. CV15-5811-CBM (SSx)

PLAINTIFF'S RESPONSE TO DEFENDANT'S MOTION FOR SUMMARY JUDGMENT

Hearing Date: May 10, 2016
Time: 10 a.m.
(Before the Honorable Consuelo B. Marshall)

1. TABLE OF CONTENTS

2. TABLE OF AUTHORITIES

3. PLAINTIFF'S RESPONSE TO DEFENDANT'S MOTION FOR SUMMARY JUDGMENT AND MEMORANDUM OF POINTS AND AUTHORITIES IN SUPPORT OF SAME

///


TABLE OF CONTENTS


I. INTRODUCTION ................................................................................................ 5
II. PROCEDURAL HISTORY ................................................................................. 8
III. LEGAL ARGUMENT .......................................................................................... 9
A. DISPUTE OF MATERIAL FACTS AND INADEQUATE SEARCH ..................................... 9
B. ASSUMING NO DISPUTE OF MATERIAL FACTS, THE FAA IS NOT ENTITLED TO
SUMMARY JUDGMENT AS A MATTER OF LAW ........................................................... 12
1. The validation study and summary show no merit of being privileged ......... 12
(a) The validation study and the summary do not meet the elements of
privilege ...................................................................................................... 12
(i) The Validation Study and the Summary merely reveal facts, which
are not protected under privilege ........................................................... 13
(ii) There is a lack of litigation needed for the FAA to anticipate in
relation to the study and the summary .................................................... 14
(iii) The Study and the Summary were not prepared in anticipation of
litigation .................................................................................................. 15
(b) Substantial need/undue hardship and balancing of interests overcome
privilege ...................................................................................................... 18
(c) Even assuming the study and summary are covered by privilege, the
FAA waived that privilege ......................................................................... 20
2. The Validation Study and Summary Are Not Privileged ............................... 22
(a) APT Metrics is not an attorney capable of providing legal advice ..... 22
IV. CONCLUSION ................................................................................................... 22



TABLE OF AUTHORITIES

CASES
Bairnco Corp. Sec. Litig. v. Keene Corp., 148 F.R.D. 91 (S.D.N.Y. 1993) ............... 16
California Sportfishing Protection Alliance v. Chico Scrap Metal, Inc., 299 F.R.D.
638 (E.D. Cal. 2014) ................................................................................................ 13
Coastal Corp. v. Duncan, 86 F.R.D. 514 (D. Del. 1980) ............................................ 19
Columbia Pictures Television, Inc. v. Krypton Broadcasting of Birmingham, 259 F.3d
1186 (9th Cir. 2001) ................................................................................................. 21
Exxon Corp. v. FTC, 466 F. Supp. 1088 (D.D.C. 1978), aff'd, 663 F.2d 120 (D.C. Cir.
1980) ......................................................................................................................... 13
Garcia v. City of El Centro, 214 F.R.D. 587 (S.D. Cal. 2003) ................................... 13
Hamdan v. U.S. Dept. of Justice, 797 F.3d 759 (9th Cir. 2015) ................................. 11
Harper v. Auto-Owners Ins. Co., 138 F.R.D. 655 (S.D. Ind. 1991)............................ 15
Hickman v. Taylor, 329 U.S. 495 (1947)..................................................................... 19
In re Grand Jury Investigation, 599 F.2d 1224 (3rd Cir. 1979).................................. 15
In re Grand Jury Subpoena (Mark Torf/Torf Envtl Mgmt), 357 F.3d 900 (9th Cir.
2003) ................................................................................................................... 12, 16
In re Green Grand Jury Proceedings, 492 F.3d 976 (8th Cir. 2007).......................... 12
In re Jury Subpoenas, 318 F.3d 379 (2nd Cir. 2003) .................................................. 18
Kintera, Inc. v. Convio, Inc., 219 F.R.D. 503 (S.D. Cal. 2003) ................................... 21
Moody v. I.R.S., 654 F.2d 795 (D.C. Cir. 1981) .......................................................... 19
Nat'l Council of La Raza v. DOJ, 411 F.3d 350 (2d Cir. 2005) .................................. 21
Parrot v. Wilson, 707 F.2d 1262 (11th Cir. 1983) ...................................................... 19
Ramsey v. NYP Holdings, Inc., 2002 U.S. Dist. LEXIS 11728 (S.D.N.Y. 2002) ....... 13
S. Union Co. v. Southwest Gas Corp., 205 F.R.D. 542 (D. Ariz. 2002) ..................... 12
Tayler v. Travelers Ins. Co., 183 F.R.D. 67 (N.D.N.Y. 1998) .................................... 17
Texaco Puerto Rico, Inc. v. Department of Consumer Affairs, 60 F.3d 867 (1st Cir.
1995) ......................................................................................................................... 19
U.S. Department of State v. Ray, 502 U.S. 164 (1991) ................................................. 5
U.S. v. Christensen, 801 F.3d 970 (9th Cir. 2015) ...................................................... 19
U.S. v. Fort, 472 F.3d 1106 (9th Cir. 2007) ............................................................... 18
U.S. v. Nobles, 422 U.S. 225 (1975) ............................................................................ 19
U.S. v. Richey, 632 F.3d 559 (9th Cir. 2011) .................................................. 12, 15, 16
U.S. v. Textron Inc. and Subsidiaries, 577 F.3d 21 (1st Cir. 2009) ............................ 18
United States v. Adlman, 68 F.3d 1495 (2d Cir. 1995) ......................................... 12, 16
Upjohn Co. v. U.S., 449 U.S. 383 (1981) .................................................................... 19




Verizon California Inc. v. Ronald A. Katz Technology Licensing, L.P., 266 F.Supp.2d
1144 (C.D. Cal. 2003) .............................................................................................. 20
Yurick v. Liberty Mut. Ins. Co., 201 F.R.D. 465 (D. Ariz. 2001) ................................ 17
Zemansky v. EPA, 767 F.2d 569 (9th Cir. 1985) ......................................................... 11
STATUTES
41 CFR § 60-3.5 ............................................................................................................ 6
41 CFR § 60-3.7 ............................................................................................................ 6
42 U.S.C. § 2000e-2(h) ............................................................................................ 6, 17
RULES
Fed. R. Civ. P. 26(b)(3) ............................................................................................... 19
Fed. R. Civ. P. 56(a) ...................................................................................................... 9
REGULATIONS
29 CFR § 1607.1 .......................................................................................................... 17
29 CFR § 1607.15 .......................................................................................................... 6
29 CFR § 1607.4(D) .................................................................................................... 17
OTHER AUTHORITIES
Black's Law Dictionary, http://thelawdictionary.org/validation ................................... 6
http://www.siop.org/workplace/employment%20testing/information_to_consider_when_cre.aspx ............ 6
https://www.opm.gov/policy-data-oversight/assessment-and-selection/other-assessment-methods/biographical-data-biodata-tests/ ............................................... 6
Merriam-Webster.com. Merriam-Webster, n.d. Web. 25 Apr. 2016 ............................ 6
Restatement (Third) of the Law Governing Lawyers § 87 cmt. g (2000) ................... 14
Restatement (Third) of the Law Governing Lawyers § 87(1) (2000) ......................... 13




Plaintiff Jorge Alejandro Rojas, through undersigned counsel, respectfully
opposes Defendant's Motion for Summary Judgment. Plaintiff submits the
accompanying Controverting Statement of Facts and Separate Statement of Facts,
Affidavit, and Exhibits in support of Plaintiff's Response.

MEMORANDUM OF POINTS AND AUTHORITIES


I. INTRODUCTION
The Freedom of Information Act ("FOIA") was enacted to "pierce the veil of
administrative secrecy and open Agency action to the light of public scrutiny." U.S.
Department of State v. Ray, 502 U.S. 164, 173 (1991). Since 2013, over 3,000
individuals have been negatively impacted by Defendant Federal Aviation
Administration's ("FAA") changes to the Air Traffic Control Specialist ("ATCS") hiring
process. Plaintiff's Statement of Facts ("PSOF") ¶¶ 6-7. The FAA significantly reduced
the requirements for this safety-sensitive and skill-intensive position. Id. ¶ 8. Part of the
new hiring process included purging an employment referral list of approximately
2,000-3,000 qualified candidates. Id. ¶ 7. These candidates graduated from FAA-
sanctioned Air Traffic Collegiate Training Institutions ("CTI") and passed the FAA's
previously extensively validated air traffic control aptitude examination (AT-SAT). Id.
¶¶ 8, 9. Spokesman Tony Molinaro said the decision was made to add diversity to the
workforce. Id. ¶ 10. The FAA stated in various notifications to impacted individuals
that a "Biographical Questionnaire" would be used for the new hiring process. Id. ¶¶ 11,
13. This included statements from John Scott, Chief Operating Officer of APT Metrics.
Id. ¶ 12. The FAA's new hiring process included a new exam, taken online from the
applicant's home, called the Biographical Assessment ("BA"). Defendant Federal
Aviation Administration's Statement of Facts ("DSOF") ¶ 12.




The subject of this action is the disclosure of the 2015 validation study and any
related summaries, required to be completed pursuant to statute1 and regulation.2
Validation is defined as "to recognize, establish, or illustrate the worthiness or
legitimacy of."3 The Society for Industrial and Organizational Psychology, Inc.
("SIOP") is cited on the United States Office of Personnel Management ("OPM")
website regarding bio-data testing such as the BA.4 According to the SIOP,
"[e]xperienced and knowledgeable test publishers have (and are happy to provide)
information on the validity of their testing products."5 Plaintiff is simply requesting
what experienced and knowledgeable test publishers are usually happy to provide.


This case is about the FAA's continued lack of institutional veracity and repeated
improper attempts at withholding documents that are clearly subject to release and
review. The FAA has failed to be upfront about the rationale or methodology of the
new screening and testing process. Therefore, Plaintiff is utilizing the FOIA process to
serve the public interest by sharing records concerning the changes with those impacted
by the action. Those impacted by the FAA changing the standards for hiring ATCS are
not a small subset of society; anyone who flies is adversely impacted by the
degradation of the national airspace system at the hands of those entrusted to ensure
safety. FAA Spokesman Mr. Molinaro stated that the purge of the list of eligible
candidates was done to "add diversity to the workforce." PSOF ¶ 10. Piercing the veil
of administrative secrecy and opening up the FAA's actions to the light of public
scrutiny is particularly necessary in this case to ensure public safety. Revealing the


1 Including 42 U.S.C. § 2000e-2(h).
2 Including 29 CFR § 1607.15; 41 CFR § 60-3.5; and 41 CFR § 60-3.7.
3 Merriam-Webster.com. Merriam-Webster, n.d. Web. 25 Apr. 2016; see also Black's Law
Dictionary, http://thelawdictionary.org/validation.
4 https://www.opm.gov/policy-data-oversight/assessment-and-selection/other-assessment-methods/biographical-data-biodata-tests/
5 http://www.siop.org/workplace/employment%20testing/information_to_consider_when_cre.aspx




FAA's intentional compromising of safety due to political correctness is the type of
action the FOIA law was meant to reveal.

Plaintiff was forced to file the underlying suit due to the FAA's well-practiced
game of administrative thimblerig and continual delay. Even after the instant action
was commenced, the continual delays and back-pedaling continued. In summary, the
FAA changed its hiring practices in 2013 and began using a new examination, which it
referred to as the "Biographical Questionnaire" ("BQ") when announcing the changes to
the hiring process, for hiring ATCS for the 2014 vacancy announcement. PSOF ¶¶ 6-9,
11-14. In 2015, the agency used a different examination, with a different question
set, than the 2014 examination. PSOF ¶¶ 14-15. The 2015 examination was called the
Biographical Assessment ("BA"). Id.

The FAA has consistently alleged that it performed a validation study for both
the 2015 and 2014 examinations. DSOF ¶ 4; PSOF ¶¶ 9, 13-14, 16. Despite admitting
that the 2014 exam was validated without anticipation of litigation, the FAA now asserts
litigation as a reason for withholding the 2015 exam validation. Just recently, after
intense congressional pressure, Administrator Huerta admitted to Congress that the
2014 test was not validated until the end of 2014, which is months after the FAA
claimed it was originally validated and was after the FAA had already hired individuals
using that exam. Id. ¶ 17. Despite this, the FAA nevertheless stated to approximately
91% of applicants, or 26,104 individuals, that it performed the validation study on time
for 2014. Id. ¶ 18. In 2015, the agency denied approximately 73% of applicants, or
13,219 individuals, under the claim that the biographical test ruled they were not
suitable for the position. Id. ¶ 19. The FAA's claim that the validation documents were
prepared by APT Metrics because of anticipated litigation is false. The FAA was
required by statute to perform a validation study for the new examination. In
fact, performing validation studies is a course of normal agency business and therefore




not subject to Exemption 5. Additionally, based on Defendant's Vaughn index, it is clear
that the FAA has yet to perform an adequate search for responsive records.


II. PROCEDURAL HISTORY


Biographical Assessment Validation Study (6130). On or about May 20, 2015,
Plaintiff requested records concerning the validation study for the 2015 Biographical
Assessment ("BA"). PSOF ¶ 20. The request was assigned to multiple organizations
within the FAA. Id. The subject of the instant action is the response from the FAA's
Office of the Chief Counsel ("AGC"). On June 18, 2015, the AGC responded with a
FOIA Exemption 5 claim: deliberative process and attorney-client privilege. DSOF
¶ 16.
On June 25, 2015, Plaintiff submitted a FOIA appeal concerning AGC's
response. Id. ¶ 17. Plaintiff alleged that the documents were not protected by the
attorney-client or deliberative process privilege. See DSOF Exhibit B.

Plaintiff received no reply from the FAA within the statutory twenty-day period.
Therefore, on July 31, 2015, Plaintiff filed the underlying action. (Dkt. # 1).

During conversations between Plaintiff and FAA Counsel, it was made clear that
the subject of this action was the validation study proving that the administration and
use of the 2015 Biographical Assessment ("BA") was valid. PSOF ¶ 21. In other words,
Plaintiff seeks proof that the BA measures characteristics related to the field for which
the test was allegedly designed.

The FAA remanded the FOIA request for processing on October 7, 2015. DSOF
¶ 18. The FAA, through counsel, indicated by telephone that, in response to Plaintiff's
FOIA request, the FAA had reviewed the wrong year of records. FAA Counsel later
emailed Plaintiff confirming that such was the case. PSOF ¶ 22. Plaintiff alleges that
this is an attempt by the FAA to further stall and block access to Agency records, as
Plaintiff's initial FOIA request was very clear as to what records were sought.



On December 10, 2015, the FAA finally provided a revised response to the
FOIA request. DSOF ¶ 20. This time, the FAA dropped its pre-decisional claim and
instead invoked attorney-client and attorney work-product privilege. Id. The FAA's
removal of the pre-decisional claim is further evidence of the FAA's consistent willful
violations of FOIA and attempts to shield Agency documents from disclosure.

The FAA states that in anticipation of litigation concerning the ATCS hiring
process, a private contractor, APT Metrics, was contracted by the Agency to perform
the validation study. Id. ¶¶ 3-4, 9-11. Plaintiff maintains that the FAA was required by
statute to perform such a validation study and that, even with the potential threat of
litigation, the validation study would have been conducted in the course of regular
agency business.

The FAA's assertion that it performed the validation study following the filing
of EEO complaints as a result of the 2014 hiring session is false, as shown by the
FAA's failure to address anticipation of litigation during the 2014 announcement, even
though it admits that the 2014 exam was validated. Id. ¶¶ 3-4. As a result of the FAA's
requirement to perform a validation study, the validation performed by APT Metrics
was done in the course of normal agency business, and therefore the validation study
is not subject to Exemption 5. Furthermore, the Vaughn Index provided by Defendant
demonstrates that an adequate search for responsive records has yet to be performed.


III. LEGAL ARGUMENT

A. Dispute of Material Facts and Inadequate Search

The FAA is not entitled to summary judgment because there is a genuine dispute
of material fact regarding whether the FAA conducted an adequate search. The court
shall grant summary judgment only "if the movant shows that there is no genuine
dispute as to any material fact." Fed. R. Civ. P. 56(a).




The FAA relies on work-product privilege in withholding the validation study.
(Def.'s Mot. for Summ. J. at 11-12, April 4, 2016). In support of this, the FAA claims
that the validation study came about as a result of anticipated litigation when it
requested the study following the filing of an EEO complaint against it. Id.
However, the facts tell a different story. Under the former air traffic aptitude test
(AT-SAT), the FAA also conducted validation studies. PSOF ¶ 9; Exhibit 10 to PSOF.
The FAA Administrator admitted that it hired APT in 2013 and that APT's work was to
last 2 years, concluding at the end of 2014. Letter from Michael Huerta, Administrator,
Fed. Aviation Admin., to Kelly Ayotte, Chair, Subcomm. on Aviation Operations,
Safety, and Sec., U.S. Senate, at 1 (Dec. 8, 2015). Exhibit 1 to PSOF. This shows that
APT Metrics was already conducting these validations before the EEO filings. Even
now, the FAA is continuing its usual practice of conducting validation studies on its
tests for the 2016 year. PSOF ¶ 23. Mem. from Teri Bristol, Chief Operating
Officer, Air Traffic Org., to Distribution, Fed. Aviation Admin. at 1 (Feb. 11, 2016).
As Officer Bristol writes in the 2016 memorandum, "[t]he FAA is evaluating potential
replacements for the AT-SAT . . . . We are asking randomly selected CPCs . . . to help
us evaluate their effectiveness as a future selection tool." Exhibit 15 to PSOF. Nowhere
in that memorandum does it mention words like "litigation" or "adversarial
proceedings."

In a 2015 letter to Congress, the FAA Administrator claimed that the FAA
"maintains the safest and most efficient aerospace system in the world" partly because
"we continuously evaluate and strengthen our ATCS hiring and training processes."
Exhibit 1 to PSOF at 2. The Administrator then states that the changes made in 2014
and 2015 were to further that commitment. Id. Given its public proclamation of
conducting a validation of the 2014 and 2015 tests, the FAA's history of validation
studies, and the fact that it did these studies before the EEO complaint even arose,




there is a genuine dispute of material fact as to whether the FAA really did request the
study as a result of the EEO complaint being filed.


Furthermore, the FAA did not conduct an adequate search and should not be
granted summary judgment. FOIA requires an agency responding to a request to
"demonstrate that it has conducted a search reasonably calculated to uncover all
relevant documents." Hamdan v. U.S. Dept. of Justice, 797 F.3d 759, 770 (9th Cir.
2015) (quoting Zemansky v. EPA, 767 F.2d 569, 571 (9th Cir. 1985)). FAA Counsel,
Alarice Medrano, advised Plaintiff that the wrong years of records were reviewed
responsive to Plaintiff's request. PSOF ¶ 22. Along with this, it is questionable whether
the FAA uncovered all the documents regarding the validation study. Former validation
studies done by the FAA have been well over 100 pages and consisted of multiple
volumes. Id. ¶ 9. Defendant's Vaughn Index shows the withheld validation documents
as being only 9 pages in length. This is drastically shorter than those previously
released. This suggests that the FAA may not be fully forthcoming about this matter.
This is further shown by the FAA Administrator admitting to Congress that it did not
even do the 2014 validation study until after the hiring took place, contrary to what it
had said previously. Id. ¶ 17. Given that the FAA is not being entirely upfront on this
matter, that it searched during the wrong time frame, and that there are inconsistencies
with the validation studies, Plaintiff has valid and reasonable concerns regarding
whether the FAA has conducted a search reasonably calculated to find all the requested
materials. As such, Defendant's Motion for Summary Judgment should be denied.


///


B. Assuming No Dispute of Material Facts, the FAA is Not Entitled to Summary Judgment as a Matter of Law

1. The validation study and summary show no merit of being privileged

(a) The validation study and the summary do not meet the elements of privilege

Neither the validation study nor the summary of it is protected by work-product
privilege. The burden of establishing protection of materials as work product is on the
proponent, and it must be specifically raised and demonstrated rather than asserted in a
blanket fashion. S. Union Co. v. Southwest Gas Corp., 205 F.R.D. 542, 549 (D. Ariz.
2002). To qualify for work-product protection, documents must "(1) be prepared in
anticipation of litigation or for trial and (2) be prepared by or for another party or by or
for that other party's representative." U.S. v. Richey, 632 F.3d 559, 567 (9th Cir. 2011)
(quoting In re Grand Jury Subpoena (Mark Torf/Torf Envtl Mgmt), 357 F.3d 900, 907
(9th Cir. 2003)).

The FAA states that it had several conversations with John Scott and asked him
to "summarize elements of his validation work related to the use of the BA as an
instrument in the ATCS selection process." DSOF ¶ 10. Furthermore, the FAA states
that APT Metrics provided FAA counsel with an initial summary of the validation
work. APT Metrics supplemented this information in January 2015. Id. ¶ 11. The
purpose of the work-product doctrine is to protect an attorney's mental processes so
that the attorney can analyze and prepare for the client's case without interference from
an opponent. United States v. Adlman, 68 F.3d 1495, 1501 (2d Cir. 1995). APT Metrics
admittedly developed the BA/BQ tests at issue and wrote the summaries. APT Metrics
is not the FAA's client. The FAA is Agency Counsel's only client.

Attorneys and clients are holders of work-product protection. See, e.g., In re
Green Grand Jury Proceedings, 492 F.3d 976, 980 (8th Cir. 2007). It is settled
that non-attorneys such as retained experts and consultants may author documents

constituting work-product, so long as they act under the general direction of attorneys.
See, e.g., Exxon Corp. v. FTC, 466 F. Supp. 1088, 1099 (D.D.C. 1978), aff'd, 663 F.2d
120 (D.C. Cir. 1980). APT Metrics was not retained as an expert because of litigation.
APT Metrics designed the BA/BQ tests, and allegedly validated the same, for testing
purposes, not in anticipation of litigation. APT Metrics could not properly act as an
independent expert or consultant if the quality of its products were at issue. APT is at
best a non-party witness to this FOIA matter. It is improper to invoke work-product
privilege for a non-party witness to preclude production of materials prepared by or for
that witness, even if the materials were created in contemplation of the witness's own
pending or anticipated litigation. Ramsey v. NYP Holdings, Inc., 2002 U.S. Dist. LEXIS
11728, at *18-*19 (S.D.N.Y. 2002). The second element is not at issue here. Because
both documents reveal only facts, because there was no litigation to be anticipated at
the time of creation, and because the documents were not prepared in anticipation of
litigation, the first element is not met. Therefore, neither type of document is protected
under work-product privilege.

(i) The Validation Study and the Summary merely reveal facts, which are not protected under privilege

Both the validation study and the summary only provide facts and, as a result,
are not protected by work-product privilege. The work-product doctrine does not
protect the underlying facts. Restatement (Third) of the Law Governing Lawyers
§ 87(1) (2000). "[B]ecause the work product doctrine is intended only to guard against
the divulging of attorneys' strategies and legal impressions, it does not protect facts
concerning the creation of work product or facts contained within the work product."
California Sportfishing Protection Alliance v. Chico Scrap Metal, Inc., 299 F.R.D. 638,
643 (E.D. Cal. 2014) (quoting Garcia v. City of El Centro, 214 F.R.D. 587, 591 (S.D.
Cal. 2003)). "Immunity does not attach merely because the underlying fact was


discovered through a lawyer's effort or is recorded only in otherwise protected work
product. . . ." Restatement (Third) of the Law Governing Lawyers § 87 cmt. g (2000).
For the example used in the Restatement regarding a lawyer's file memorandum[,]
"[i]mmunity does not apply to an interrogatory seeking names of witnesses to the
occurrence in question or whether a witness recounts a particular version of events, for
example that a traffic light was red or green." Id.

Similarly here, the FAA hired APT Metrics to conduct a validation study that
would determine a particular outcome, that is, would the 2015 test be valid for use in
hiring employees? Unless the FAA discloses the test itself, the validation study is the
only source determinative of whether the test was discriminatory or not. The study
carries such weight, according to the FAA, that the Agency uses it as a reason to explain
why denied applicants were not accepted. PSOF ¶¶ 11, 13-14, 16, 26. If it is a point of
argument that the FAA is going to continually rely on, then that fact should be available
to the public. There is no creeping into an attorney's strategies or legal impressions,
especially when APT Metrics is not in the business of giving legal advice. Id. ¶¶ 24,
25. Given that the validation study and the summary are but underlying facts, work-
product protection does not apply.

(ii) There is a lack of litigation needed for the FAA to anticipate in relation to the study and the summary

There was no litigation that could have been anticipated in relation to the
validation study or its summary. "Litigation includes civil and criminal trial
proceedings, as well as adversarial proceedings before an administrative agency, an
arbitration panel or a claims commission, and alternative-dispute-resolution
proceedings such as mediation or mini-trial." Restatement (Third) of the Law
Governing Lawyers § 87 cmt. h (2000). "In short, an adversarial rulemaking proceeding
is litigation for purposes of the immunity." Id. The litigation in question, though, cannot
be some vague suspicion that litigation might come from a situation. "Because litigation


can, in a sense, be foreseen from the time of occurrence of almost any incident, courts
have interpreted the Rule to require a higher level of anticipation in order to give a
reasonable scope to the immunity." Harper v. Auto-Owners Ins. Co., 138 F.R.D. 655,
659 (S.D. Ind. 1991). Courts have ranged from emphasizing litigation being "real and
imminent" to litigation being identifiable or reasonable. In re Grand Jury
Investigation, 599 F.2d 1224, 1229 (3rd Cir. 1979).

In this case, we are dealing with a validation study meant to ensure the quality
of the test used to fill ATCS positions. As already shown, this is not the first time the
FAA has conducted a validation study, and it continues to conduct them today. PSOF
¶¶ 9, 23. Furthermore, APT Metrics' website highlights the importance of disclosing
validation studies and ensuring a transparent hiring system. Id. ¶ 25. Again, with the
burden falling on the FAA, it is up to the FAA to show how this particular validation
study was somehow prepared not only for a real possibility of litigation but also for the
litigation contemplated by the EEO complaint it references. The mere fact that a
complaint is filed does not convert documents that were regularly created in the past
into documents falling under work-product privilege. Because there is no connection
made between the litigation in relation to the EEO complaint and the validation studies,
the work-product privilege does not apply.

(iii) The Study and the Summary were not prepared in anticipation of litigation

The FAA did not prepare the validation study or the summary in anticipation of
litigation for work-product purposes. Both documents were required by law and are a
part of regular Agency business. Even assuming they were tied to some possibility of
litigation, "[i]n circumstances where a document serves a dual purpose, that is, where
it was not prepared exclusively for litigation, then the 'because of' test is used." U.S.
v. Richey, 632 F.3d 559, 567-568 (9th Cir. 2011). This "because of" test "consider[s]
the totality of the circumstances and affords protection when it can fairly be said that


the document was created because of anticipated litigation, and would not have been
created in substantially similar form but for the prospect of that litigation[.]" In re
Grand Jury Subpoena (Mark Torf/Torf Envtl Mgmt), 357 F.3d 900, 908 (9th Cir. 2003)
(quoting United States v. Adlman, 134 F.3d 1194, 1195 (2nd Cir. 1998)). Therefore,
even if the documents were prepared in anticipation of litigation, the materials are not
work-product if they would have been prepared irrespective of the prospect of
litigation. Bairnco Corp. Sec. Litig. v. Keene Corp., 148 F.R.D. 91, 103 (S.D.N.Y.
1993).

The Ninth Circuit case U.S. v. Richey is greatly on point here. In that case, the
appellees retained a law firm for legal advice concerning a conservation easement. U.S.
v. Richey, 632 F.3d 559, 562 (9th Cir. 2011). That law firm retained an appraiser to
provide valuation services and advice with respect to the conservation easement. Id.
As a result, the appraiser prepared an appraisal report "to be filed with the Taxpayers'
2002 federal income tax return . . . ." Id. The Ninth Circuit found that the appraisal
work file could not be said to have been prepared in anticipation of litigation. Richey,
at 568. Despite being related to the law firm's representation, the Ninth Circuit
emphasized the fact that "the appraisal report [was] . . . required by law." Id. "Had the
IRS never sought to examine the Taxpayers' 2003 and 2004 federal income tax returns,
the Taxpayers would still have been required to attach the appraisal to their 2002 federal
income tax return. Nor is there evidence in the record that [the appraiser] would have
prepared the appraisal work file differently in the absence of prospective litigation." Id.

Like in Richey, this case involves a party's law firm contracting with another
entity to create a document that assesses certain facts within its area of expertise. Like
in Richey, the FAA was required to conduct the validation study by law.

Title VII of the Civil Rights Act of 1964 ("Title VII") prohibits the use of
discriminatory tests and selection procedures. Title VII permits the use of employment
tests so long as they are not "designed, intended or used to discriminate because of race,

color, religion, sex or national origin." 42 U.S.C. § 2000e-2(h). The Federal government
has issued regulations to meet the needs set by Title VII. Specifically, 29 CFR § 1607.1
states in part:

They are designed to provide a framework for determining the proper use of tests
and other selection procedures. These guidelines do not require a user to conduct
validity studies of selection procedures where no adverse impact results.
However, all users are encouraged to use selection procedures which are valid,
especially users operating under merit principles. [emphasis added].

The language and spirit of Part 1607 are clear that the selection procedure's validity must
be well documented and properly performed. Adverse impact existed in the 2014 hiring
session, which occurred prior to the 2015 hiring session. As 29 CFR § 1607.4(D)
describes, selection rates for any group lower than 4/5 of the rate of the group with the
highest success will generally be regarded as evidence of adverse impact. The 2014
hiring session had adverse impact ratios of .73 for Blacks. PSOF ¶ 27. That ratio falls
below the 4/5 (0.80) threshold. These rates are for the phase of the application
immediately following the administration of the BA used in 2014. Therefore, the
adverse ratios identified above are a result of the BA used in 2014. As adverse impact
exists, the agency was required to perform a validation study. Since the Agency was
required to perform a validation study, it was performed in the course of regular agency
business and therefore not subject to Exemption 5.

There is nothing in the record to suggest that the study would not have been
created in substantially similar form but for the prospect of litigation. Even without
this present matter, the FAA would still have been required to conduct the validation
study.

Again, besides being required by law, the study was a part of regular agency
business. "There is no work product immunity for documents prepared in the ordinary
course of business prior to the commencement of litigation." Yurick v. Liberty Mut. Ins.
Co., 201 F.R.D. 465, 472 (D. Ariz. 2001) (quoting Tayler v. Travelers Ins. Co., 183
F.R.D. 67, 69 (N.D.N.Y. 1998)); see also U.S. v. Fort, 472 F.3d 1106, 1118 n. 13 (9th


Cir. 2007) (quoting In re Jury Subpoenas, 318 F.3d 379, 384-85 (2nd Cir. 2003)) (In a
criminal case, the Ninth Circuit agreed with the 2nd Circuit that the privilege would not
apply to materials in an attorney's possession that were "prepared . . . [by] a third party
in the ordinary course of business" and that "would have been created in essentially
similar form irrespective of any litigation anticipated by counsel").

In U.S. v. Textron Inc. and Subsidiaries, 577 F.3d 21, 30 (1st Cir. 2009), the case
involved "[a] set of tax reserve figures." Despite the dispute arising with the IRS, the
First Circuit found the ordinary business rule applied straightforwardly and found the
figures to have been prepared in the ordinary course of business. Id. The First Circuit
reasoned that "[e]very lawyer who tries cases knows the touch and feel of materials
prepared for a current or possible . . . law suit . . . . No one with experience with law
suits would talk about tax accrual work papers in those terms." Id. The figures were for
the purpose of supporting a financial statement and the independent audit of it. Id.

Similarly here, as evidenced by the FAA's continued practice of conducting
validation studies in 2016, it is not talking about these studies in the terms of
litigation. PSOF ¶ 23. Just as corporations have the regular imperative to acquire
accurate financial statements, so too does the FAA have the regular imperative to
ensure that it is using a test that is selecting highly qualified candidates. The studies
were meant to be an independent verification that the tests were of the proper caliber.

As a result, the validation study and its summary fail to pass the "because of"
test and were prepared through regular agency business, and thus were not created in
anticipation of litigation. Therefore, they are not protected by work-product privilege.

(b) Substantial need/undue hardship and balancing of interests overcome privilege

Assuming arguendo that the work-product privilege applies, substantial need,
undue hardship, and balancing of interests trump that privilege. The privilege derived
from the work-product doctrine is not absolute. U.S. v. Nobles, 422 U.S. 225, 239


(1975). The scope of the doctrine entails one needing to balance [the] competing
interests of the privacy of a man's work on one end against the fact that public
policy supports reasonable and necessary inquiries. Hickman v. Taylor, 329 U.S. 495,
497 (1947). Fed. R. Civ. P. 26(b)(3) permits disclosure of documents and tangible
things constituting attorney work product upon a showing of substantial need and
inability to obtain the equivalent without undue hardship. Upjohn Co. v. U.S., 449 U.S.
383, 400 (1981). "[W]hen documents have been generated by the government[,]"
scrutiny of a claim of privilege by an attorney of the government is even more essential
". . . where many attorneys function primarily as policy-makers rather than as lawyers."
See Coastal Corp. v. Duncan, 86 F.R.D. 514, 521 (D. Del. 1980); see also Texaco Puerto
Rico, Inc. v. Department of Consumer Affairs, 60 F.3d 867, 884 (1st Cir. 1995).

The Ninth Circuit case U.S. v. Christensen, 801 F.3d 970, 983 (9th Cir. 2015),
is applicable to this matter. In Christensen, the defendant hired a third party to wiretap
an individual who was in a dispute with one of the defendant's clients. In that case, the
Ninth Circuit found that the work product doctrine did not apply. Id. at 1009. The
Ninth Circuit reasoned that the purpose of the work product privilege is "to protect the
integrity of the adversary process." Id. at 1010 (quoting Parrot v. Wilson, 707 F.2d
1262, 1271 (11th Cir. 1983)). It does not apply "to foster a distortion of the adversary
process" by protecting illegal actions. Christensen, at 1010. "It would indeed be
perverse . . . to allow a lawyer to claim an evidentiary privilege to prevent disclosure
of work product generated by those very activities the privilege was meant to prevent."
Id. (quoting Moody v. I.R.S., 654 F.2d 795, 800 (D.C. Cir. 1981)).

In this case, FOIA was passed by Congress to also ensure the integrity and
openness of its government. Similar to the Ninth Circuit's reasoning, work-product
privilege does not protect illegal actions. This is especially true in the FOIA context.
If the FAA did indeed discriminate in the testing process, the validation study was
bound to be work-product generated by that illegal conduct since it was required by

law to be done. It would be perverse to allow the government to shield documents that
flowed from such conduct, and thus diminish the integrity of government. Not only
this, but the public has a substantial need to make sure that our air traffic control
facilities are being manned by properly trained individuals so as to avoid needlessly
endangering lives and harming worldwide commerce. In terms of undue hardship, the
FAA is the only one with access to the test. Plaintiff does not have the option of hiring
another to conduct its own validation.


The validation study is not work-product. The summaries composed by APT
Metrics at the behest of Agency Counsel were not written by Agency Counsel. The
summaries Agency Counsel requested from APT Metrics were prepared by an
industrial-organizational psychologist, John Scott, and his staff, concerning the
validation study. These summaries were sent to FAA counsel. DSOF ¶¶ 12-13. APT
Metrics is not the FAA's counsel. The summaries are not material produced by the
Agency's legal staff.

As a result, after balancing the various interests, the substantial need, and the
undue hardship, work-product privilege cannot stand to protect the validation study and
the summary.


(c) Even assuming the study and summary are covered by privilege, the FAA waived that privilege

Assuming work-product privilege still remains valid, it is irrelevant since the
FAA waived its privilege as to the validation study and the summary. "[W]ork product
immunity may not be used both as a sword and a shield. Where a party raises a claim
which in fairness requires disclosure of the protected communication, [these
protections] may be implicitly waived." Verizon California Inc. v. Ronald A. Katz
Technology Licensing, L.P., 266 F.Supp.2d 1144, 1148 (C.D. Cal. 2003) (quoting


Columbia Pictures Television, Inc. v. Krypton Broadcasting of Birmingham, 259 F.3d
1186, 1196 (9th Cir. 2001)).

In Kintera, Inc. v. Convio, Inc., 219 F.R.D. 503, 513 (S.D. Cal. 2003), the District
Court reasoned that "it is apparent that important and significant portions of the
witness affidavits were disclosed on [plaintiff's] website." The Court in that case held
that it would be "inconsistent [with the Electro Scientific court's rationale] to find
[plaintiff's] document maintained privileged status after such a disclosure . . . ." Id. The
Court also referenced how the Electro Scientific court based its ruling on the fact that
the party in that case intentionally disclosed the information in an act calculated to
advance that party's commercial interest. Id.

Similarly here, the FAA has taken to the practice of selectively publishing its
validation studies over the years. PSOF ¶¶ 9, 23. When nothing appeared wrong and the
studies reflected the hiring of qualified applicants, the FAA showed them to the public,
as the plaintiff did in Kintera. Now, all of a sudden, the FAA seeks to claim such studies,
which are mandated by law, as protected work-product. Coincidentally, this shift in
agency policy concerning the disclosure of the validation study comes after the changes
to the hiring process. Again, the purpose of work-product privilege is to protect the
privacy of the attorney. After sharing such things with the world, it is dubious to now
claim that the public should respect its privacy. Plaintiff is not out to acquire all
documents related to the subject matter of the tests; it simply seeks the studies.

Additionally, the FAA told the Vice President of the United States, Congress,
rejected applicants, and the media that the test was validated. PSOF ¶¶ 11-14, 16, 26.
After making such a widespread declaration, when it is asked to put its money where
its mouth is, it refuses to disclose. The FAA cannot have it both ways. This point
is especially strengthened by the fact that the hiring process reflects an agency policy
of the FAA. See Nat'l Council of La Raza v. DOJ, 411 F.3d 350, 360-61 (2d Cir. 2005)
(stating that attorney-client privilege's rationale does not apply to documents that reflect

actual agency policy). The public should be allowed to fact-check what the FAA has
already stated openly to the world.


Because the FAA waived its work-product privilege in relation to the validation
study and summary, such documents are not protected under the privilege.

2. The Validation Study and Summary Are Not Privileged

(a) APT Metrics is not an attorney capable of providing legal advice

Both the validation study and the summary fail to fall under attorney-client
privilege. As the title suggests, even before getting into a test to determine attorney-
client privilege, one needs an attorney. Since only the study and the summary are being
sought, FAA Counsel is claiming that these two documents contained legal advice.
(Def.'s Mot. for Summ. J. at 13, April 4, 2016). In essence, this assertion claims
that APT Metrics was an attorney to the FAA. Id. at 19. APT Metrics created both the
validation study and the summary. DSOF ¶¶ 3-4, 9-11. Nowhere in the record does it
show that APT Metrics or John Scott is authorized to provide legal advice. Such a fact
is fundamental to any assertion of attorney-client privilege. Because neither APT
Metrics nor John Scott is authorized to provide legal advice, the FAA's argument that
the validation study and the summary contain legal advice is meritless. Therefore,
neither the study nor the summary of it is protected by attorney-client privilege.

IV. CONCLUSION

The spirit and intent of the Freedom of Information Act is to "pierce the veil of
administrative secrecy and open Agency action to the light of public scrutiny." Agency
actions that compromise public safety, and subsequent attempts to cover up illegal
activities, must be revealed for the sanctity of our democratic system. As there are
genuine disputes of material fact, the FAA did not conduct an adequate search, and the
documents are not shielded under either work-product privilege or attorney-client
privilege, Defendant's Motion for Summary Judgment should be denied. Furthermore,



Plaintiff asks that this Court order the FAA to produce the documents requested and
conduct an adequate search in a timely manner.
RESPECTFULLY SUBMITTED this 25th day of April, 2016.


CURRY, PEARSON & WOOTEN, PLC

/s/ Michael W. Pearson


Michael W. Pearson
814 W. Roosevelt St.
Phoenix, AZ 85007
Attorney for Plaintiff


CERTIFICATE OF SERVICE
I hereby certify that on this 25th day of April, 2016, I electronically transmitted the
foregoing document to the Clerk's Office using the CM/ECF System for filing and
transmittal of a Notice of Electronic Filing to the following CM/ECF registrant(s):
Eileen M. Decker
United States Attorney
Dorothy A. Schouten
Assistant United States Attorney
Alarice M. Medrano
Assistant United States Attorney
300 North Los Angeles Street
Room 7516, Federal Building
Los Angeles, California 90012-9834
Attorneys for Defendants


/s/ Christine L. Penick



Case 2:15-cv-05811-CBM-SS Document 27 Filed 04/25/16 Page 1 of 6 Page ID #:190

CURRY, PEARSON & WOOTEN, PLC
Attorneys at Law
814 W. Roosevelt
Phoenix, Arizona 85007
(602) 258-1000 Fax (602) 523-9000

Michael W. Pearson, AZ SBN 016281
[email protected]
[email protected]
Admitted Pro Hac Vice
Attorneys for Plaintiff

IN THE UNITED STATES DISTRICT COURT
FOR THE CENTRAL DISTRICT OF CALIFORNIA
WESTERN DIVISION

Jorge Alejandro Rojas,
          Plaintiff,
vs.
Federal Aviation Administration,
          Defendant.

Case No. CV15-5811-CBM (SSx)

PLAINTIFF'S CONTROVERTING STATEMENT OF FACTS AND SEPARATE STATEMENT OF FACTS IN SUPPORT OF PLAINTIFF'S RESPONSE TO DEFENDANT'S MOTION FOR SUMMARY JUDGMENT

Hearing Date: May 10, 2016
Time: 10 a.m.
(Before the Honorable Consuelo B. Marshall)

Plaintiff submits this Controverting Statement of Facts and Separate Statement
of Facts in Support of Plaintiff's Response to Motion for Summary Judgment
("PSOF"). The facts of record show the impropriety and inaccuracies of Defendant
Federal Aviation Administration's ("FAA") Statement of Facts. The controverting facts
support a finding that Defendant is not entitled to summary judgment, and that
Defendant's Motion for Summary Judgment must therefore be denied. These
inaccuracies also support a finding that the FAA is not eligible for summary judgment in


this Freedom of Information Act ("FOIA") action, as Defendant's statements are
controverted. Military Audit Project v. Casey, 656 F.2d 724, 738 (D.C. Cir. 1981).

PLAINTIFF'S CONTROVERTING STATEMENT OF FACTS

1. Plaintiff denies Defendant's Statement of Fact ("DSOF") ¶ 1. Based on the
record available concerning the Agency's review of the process for hiring of the Air
Traffic Control Specialist ("ATCS") position, it is clear that the Agency did not
undertake a comprehensive review. Exhibit 1: December 8, 2014 Letter from FAA
Administrator Michael Huerta.

2. Plaintiff denies DSOF ¶ 4. APT Metrics developed a different examination,
the Biographical Questionnaire ("BQ") test. Furthermore, John C. Scott, along with
numerous other FAA officials, called the exam a "questionnaire" in 2014. Exhibit 2:
2014 Biographical Assessment and 2015 Biographical Assessment. Exhibit 3: January
8, 2014 Telcon Transcript. Exhibit 4: Joseph Teixeira Email. Exhibit 5: Matthew Borten
Statement.

3. Plaintiff is unable to make a characterization of DSOF ¶ 5. The purpose of
this civil action is to identify the ability of the 2015 BA to identify the characteristics
needed for the ATCS position. Exhibit 18: Jorge Alejandro Rojas ("Rojas Affidavit")
Affidavit ¶ 4.

4. Plaintiff avers DSOF ¶ 9. The 2014 BA was substantially different. Exhibit
2: 2014 Biographical Assessment and 2015 Biographical Assessment.

5. Plaintiff avers DSOF ¶ 12. The 2014 and 2015 exams were starkly different.
Id.

PLAINTIFF'S SEPARATE STATEMENT OF FACTS

6. The Federal Aviation Administration ("FAA") changed the hiring practices for
Air Traffic Control Specialists in December of 2013, taking effect in February


2014. Exhibit 6: FAA, "ATC Hiring Stakeholder Briefing on Hiring," at Slide 2.
Exhibit 7: FAA Letter to Collegiate Training Initiative Schools' Graduates.

7. The new practices included removing a list of about 2,000-3,000 individuals
who were on a Qualified Applicant Register or other list of candidates. Individuals were
negatively impacted as they were forced to reapply under the new hiring system and
not all were selected. Exhibit 8: National Black Coalition of Federal Aviation
Employees ATC Hiring update from the National President, at page 1.

8. The Agency changed to only requiring a four-year degree in any field, or three
years of work experience, or a combination of both. Exhibit 7: FAA Letter to Collegiate
Training Initiative Schools' Graduates, at page 1. Previously the Agency used to hire
from a group of schools approved by the Agency, offering aviation-specific education.
Individuals were required to take the Air Traffic Selection and Training ("AT-SAT")
exam. Exhibit 9: FAA, "Air Traffic Collegiate Training Initiative (AT-CTI),"
8/10/2011 to 2/25/2014 Website, at page 1.

9. The Air Traffic Selection and Training (AT-SAT) was previously extensively
validated by the FAA, as indicated by a multi-volume document released by Defendant.
Exhibit 10: AT-SAT Validation Documents.

10. FAA Spokesman Tony Molinaro said the FAA's decision to modify the
hiring process was to add diversity to the workforce. Exhibit 11: INFORUM, "Want
to be an air traffic controller? UND says FAA has dumbed down the process," at page
1.

11. The Agency, in a January 8, 2014 telephone conference, stated that a
"Biographical Questionnaire" would be used in 2014. The Agency further stated that
the exam was designed, developed and validated through the FAA's Civil Aerospace
Medical Institute ("CAMI"). Exhibit 3: January 8, 2014 Telcon Transcript.

12. John C. Scott, Chief Operating Officer of APT Metrics, stated that a
"Biographical Questionnaire" would be used in 2014. Id., at page 8.



13. The Agency also provided several letters and statements concerning the use and validation of the Biographical Questionnaire in 2014. Exhibit 4: Joseph Teixeira Email. Exhibit 5: Matthew Borten Statement.

14. The Agency stated that the 2015 Biographical Assessment was "newly developed and empirically validated." Exhibit 6: FAA, "ATC Hiring Stakeholder Briefing on Hiring," at Slide 3.

15. The Agency's 2014 Biographical Assessment was very different from the 2015 Biographical Assessment. Exhibit 18: Rojas Affidavit ¶ 5. Exhibit 2: 2014 Biographical Assessment and 2015 Biographical Assessment.

16. The Agency notified applicants that were rejected that they "failed because of the validated biographical exam." Exhibit 12: BA Rejection Notices.

17. FAA Administrator Huerta admitted that the job-task analysis and the validation were not completed until the end of 2014, significantly after the 2014 exam. Exhibit 1: December 8, 2014 Letter from FAA Administrator Michael Huerta.

18. The Agency's responses to previous FOIA requests 2015-008178 and 2016-000431 reveal that 2,407 passed the biographical exam in 2014, while 28,511 applied. Exhibit 18: Rojas Affidavit ¶ 6.

19. The Agency's responses to previous FOIA requests 2015-007021 and 2015-009349 reveal that 5,083 passed the biographical exam in 2015, while 18,302 applied. Exhibit 18: Rojas Affidavit ¶ 7.
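For reference, the figures recited in ¶¶ 18 and 19 imply the following overall pass rates (simple arithmetic from the counts stated above; no additional figures are asserted):

\[
\frac{2{,}407}{28{,}511} \approx 8.4\% \ \text{(2014)} \qquad\qquad \frac{5{,}083}{18{,}302} \approx 27.8\% \ \text{(2015)}
\]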
20. Plaintiff submitted Freedom of Information Act request 2015-006130 on or about May 20, 2015. Exhibit 13: 2015-006130 Acknowledgment letter.

21. During conversations with Defendant's counsel, it was made clear that the documents sought were the validation study and related communications regarding the examination. Exhibit 18: Rojas Affidavit ¶ 8.


22. Defendant's counsel Alarice M. Medrano stated to Plaintiff that it was her understanding "the wrong years" of records were reviewed as responsive to Plaintiff's request. Exhibit 14: Email from Alarice Medrano.

23. The Agency continues to perform validation studies of the AT-SAT, in the normal course of Agency business. Exhibit 15: February 11, 2016 FAA Memorandum.

24. The Agency admitted that APT Metrics is a company of human resources consultants. Exhibit 1: December 8, 2014 Letter from FAA Administrator Michael Huerta.

25. APT Metrics' website provides several references in support of a finding that validation studies should be disclosed and that the advice provided was not provided in the capacity of an attorney. Exhibit 16: APT Metrics website & "Testing the Test" PowerPoint.

26. The Agency sent a letter to the Vice President of the United States concerning the validation of the examination. Exhibit 17: Letter to Vice President Joe Biden.

27. The adverse impact ratios for the 2014 hiring announcement were compiled based on responses to FOIA requests 2015-008178 and 2016-000431. Rojas Declaration ¶ 7.
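For reference only: an adverse impact ratio is conventionally computed by comparing group selection rates, with a ratio below 0.80 (the "four-fifths rule") commonly treated as evidence of adverse impact. The conventional formulation is sketched below; the record does not restate the precise methodology used in the compilation cited in ¶ 27, so this is offered as illustration rather than a description of that compilation.

\[
\text{Adverse impact ratio} \;=\; \frac{\text{selection rate of the group at issue}}{\text{selection rate of the most-selected group}}
\]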

RESPECTFULLY SUBMITTED this 25th day of April, 2016.

CURRY, PEARSON & WOOTEN, PLC

/s/ Michael W. Pearson
Michael W. Pearson
Kyle B. Sherman
814 W. Roosevelt St.
Phoenix, AZ 85007
Attorney for Plaintiff

///


CERTIFICATE OF SERVICE
I hereby certify that on this 25th day of April, 2016, I electronically transmitted the foregoing document to the Clerk's Office using the CM/ECF System for filing and transmittal of a Notice of Electronic Filing to the following CM/ECF registrant(s):
Eileen M. Decker
United States Attorney
Dorothy A. Schouten
Assistant United States Attorney
Alarice M. Medrano
Assistant United States Attorney
300 North Los Angeles Street
Room 7516, Federal Building
Los Angeles, California 90012-9834
Attorneys for Defendants
/s/ Christine L. Penick


EXHIBIT 1

U.S. Department of Transportation
Federal Aviation Administration
Office of the Administrator
800 Independence Ave., S.W.
Washington, D.C. 20591

December 8, 2015
The Honorable Kelly A. Ayotte
Chair, Subcommittee on Aviation
Operations, Safety, and Security
United States Senate
Washington, DC 20510
Dear Madam Chair:
Thank you for your July 13 letter, cosigned by your congressional colleagues, about the Federal
Aviation Administration's (FAA) revised hiring process for entry-level Air Traffic Control
Specialists (ATCS) and requesting information about the results of the three rounds of hiring
pursuant to the recently revised process.
As you know, the FAA maintains the safest and most efficient aerospace system in the world
partly because we continuously evaluate and strengthen our ATCS hiring and training processes.
The 2014 and 2015 changes to the ATCS hiring process further that commitment. This ensures
that we use an efficient and fair process aimed at selecting those applicants with the highest
probability of successfully completing our rigorous ATCS training program from among a large
and diverse applicant pool.
The ATCS position has been and likely will continue to be a highly sought-after and well-paid
Federal occupation for which qualified applicants significantly outnumber available positions. In
2012, the FAA undertook a comprehensive review of the current ATCS selection and hiring
process as called for by the Equal Employment Opportunity Commission. This review and
subsequent analysis indicated a number of concerns in the FAA ATCS hiring process, including
the use of hiring sources, the Air Traffic Selection and Training Test (AT-SAT), and the
Centralized Selection Panel.
Accordingly, given these concerns in 2013, the FAA undertook a comprehensive analysis of how
to improve the current ATCS selection and hiring process. The FAA retained industrial
organizational psychology consultancy, Outtz and Associates, along with nationally recognized
human resources consultants, APTMetrics, to conduct a thorough review and analysis of the
ATCS hiring process, recommend improvements, and assist in implementing those
recommendations. APTMetrics' work was scheduled to last 2 years, concluding at the end of 2014.

While this work continued, the 2014 Controller Workforce Plan identified the need to hire and
train 1,286 air traffic control specialists at the FAA Academy. This required developing a
selection process to effectively evaluate the expected surge of applications in a timely and
cost-efficient manner. As a result, in February 2014 the FAA implemented the 2014 Interim
Hiring Process for one-time use, incorporating as many of APTMetrics' initial recommendations
as practicable including:

Ending the use of large inventories segregated by applicant source and unrelated to then-current hiring needs;

Opening a vacancy announcement available on the same terms to all sources (all U.S. citizens) to ensure equitable treatment and the broadest pool of qualified candidates;

Eliminating the ineffective, time-consuming, costly and un-validated subjective selection procedures associated with Centralized Selection Panels and candidate interviews; and

Developing and substituting the Biographical Assessment (BA) as a stand-alone, initial, objective selection test, in place of the AT-SAT's Experience Questionnaire subtest, which had lost its validity. The BA is a computerized test that measures important and demonstrably job-related personal characteristics of applicants.¹

For the Interim Process, the FAA chose the BA as the first step of a multi-step process to identify the most qualified job applicants. That decision reflected detailed review of each AT-SAT subtest's predictive validity (i.e., how well it differentiated successful from unsuccessful candidates), which revealed that the Experience Questionnaire (EQ) did not accurately predict success in proceeding through the FAA academy or attainment of Certified Professional Controllers (CPC) status at the first facility.

APTMetrics developed and validated the BA using years of research and data gathering by the FAA's Civil Aerospace Medical Institute for three different biographical instruments, including the EQ when it was part of the AT-SAT. The BA measured required personal job-related attributes and was validated to 1) predict pass rates at the FAA Academy, and 2) predict certification of an ATCS at his or her first assigned facility. Notably, the validation work indicated the BA had a high level of validity with little adverse effect on any discrete group or subgroup of test-takers.
The Agency also removed the interview stage of the hiring process for several reasons. The questions used in the interview were commonly shared online, and the interview process yielded an historical passing rate approaching 100 percent. Thus, and most importantly, the interview added little value in the selection of ATCS. Further, the interview process was not standardized or validated, and the managers conducting the interviews had little or no training on proper interviewing procedures. Moreover, the Agency's decision to assign facilities after training, rather than during the selection process, made it impossible for managers to interview candidates that would report to their facilities. Finally, some have raised the concern that the interview screened for language barriers; the ATCS application asks the candidates to confirm their ability to speak English clearly in the same way it asks applicants to confirm they satisfy the maximum
¹ These include flexibility; risk tolerance; self-confidence; dependability; resilience; stress tolerance; cooperation; teamwork; and rules application.

entry age of 31 years. The FAA will periodically evaluate and update interview guides and
interview process for future announcements.
As a result of these changes, the 2014 Interim Hiring Process became more efficient,
economical, and transparent. We significantly reduced applicant processing time and in 2014 saved more than $8 million in AT-SAT testing costs by using the BA as an initial screening tool. Additionally, under the legacy process, applicants could be placed on inventories for years and
have no understanding of whether they would ever be hired by the FAA and sent to the FAA
academy. Under both the 2014 and 2015 announcements, applicants who passed the selection
hurdles received a Tentative Offer Letter. Those who successfully completed the remaining
medical and security clearances were assured a position and received an estimated date to start
their academy training.
Moreover, by opening the announcements to all sources in the general public, the revised hiring process, as reflected in the chart below, significantly increased the representation of women who successfully completed the assessment process and to various extents increased the representation of racial and ethnic minorities, as compared to the Agency's legacy selection processes.

Gender                                  CPC Population (N=11,567)        Interim 2014 (N=1,593)
                                        Count        Percentage          Count         Percentage
Female                                  1,855        16%                 260           28.5%
Male                                    9,712        84%                 651           71.5%
Declined to Respond                     0            --                  682           --

Ethnicity                               CPC Population (N=11,567)        Interim 2014 (N=1,593)
                                        Count        Percentage          Count         Percentage
Multi-ethnic                            123          1.1%                48            5.3%
Hispanic or Latino                      782          6.8%                153           16.9%
Asian                                   270          2.3%                57            6.3%
Black or African American               623          5.4%                93            10.3%
American Indian or Alaskan Native       86           0.7%                [not legible] .4%
Native Hawaiian or Other Pacific
Islander                                29           0.3%                [not legible] .7%
White                                   9,654        83.5%               544           60.1%
Declined to Respond                     0            --                  688           --
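[The Interim 2014 percentages in the chart appear to be computed over applicants who disclosed the relevant demographic information rather than over all 1,593 selectees; for example, 260 / (260 + 651) ≈ 28.5% for female selectees. This basis is inferred from the figures as reproduced and is not stated expressly in the letter.]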

The 2015 ATCS hiring process is substantially similar to the 2014 interim process, with a number of modifications. First, in 2015, the Agency used a newly refined BA. The 2015 BA was developed using the newly completed 2014 job analysis of the ATCS position, which identified the critical and important requirements of the ATCS job. The new BA measures the knowledge, skills and other characteristics that could most readily be assessed with a biodata instrument, including those attributes that are not substantially assessed by the AT-SAT. A total of 1,765 current air traffic control specialists participated in the job analysis study and over 1,000 CPC and their managers contributed to validating the BA. This approach ensured comprehensive coverage of important attributes for the ATCS position.
Second, the FAA used an alternate, but equated version of the AT-SAT (excluding the Experience Questionnaire subtest). When the FAA commissioned the development of the AT-SAT more than 15 years ago, it had asked for and received 2 comparable versions. Until 2015, the Agency only used one version. However, in 2015 the FAA switched to using the second version for security and efficacy purposes. Likewise, as a test security measure, APTMetrics developed an alternate version of the BA test, each to be used randomly and concurrently, with the questions also appearing in random order for each BA.

Finally, the Agency issued a separate vacancy announcement for experienced air traffic controllers. Due to their prior experience, these candidates were placed directly into the ATC facility (bypassing the Academy) and are expected to re-certify as Certified Professional Controllers more quickly than applicants with no experience.
Your letter specifically requested pass rates at the FAA Academy for students hired under the interim and 2015 process. Understanding that these remain preliminary results, the results of the February 2014 general public vacancy announcement are as follows:

1,593 were selected;
1,113 reported to the Academy;
684 have successfully completed Academy training;
313 have separated from the Academy; and
116 remain in training.
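[Illustrative arithmetic from the figures above: 684 + 313 + 116 = 1,113, i.e., the reported completions, separations, and trainees still in progress together account for all who reported to the Academy.]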

The observed and intended effect of these Academy grading changes has been to substantially increase the failure rate of new trainees while they are at the Academy rather than have them fail after being assigned to their first Tower or En Route Center. Failures at a facility incur greater costs to the Agency than failures occurring earlier in the process. In addition, as described below, changes in Academy failure/passing rates must distinguish between those in the Academy's En Route training program and those in the Academy's Tower training programs because the Academy modified the grading requirements of each track at different times.

At this time, it is too early to draw any valid conclusions about the success of the Interim hiring process. Furthermore, drawing any statistically valid conclusions of this cohort's training performance is complicated by the FAA's decision to make the FAA Academy grading more rigorous. Any meaningful analysis must account for changes in assessment at the FAA Academy. The Agency has not yet conducted that analysis.


1. En Route Centers. Changes to the En Route Qualification grading were implemented at the Academy in 2011; since then, the team of personnel that administer evaluations continues to adjust the process to improve the consistency of such evaluations. As expected, the graph below depicts a declining class pass rate trend coincident with the introduction of strengthened grading requirements beginning in 2011 through and including 2015. Immediately after the En Route training program adopted its assessment changes in 2011, Academy pass rates declined by 12 percent, from 98 percent in 2010/2011 classes to 86 percent in 2011. Further, as the FAA continues to calibrate the En Route grading process, the pass rate has continued to decline. Prior to the use of the Interim process in FY 2014 Quarter 4, the most recent Academy pass rate was 68 percent (FY 2014 classes in Quarters 1-3) for the En Route training program. Trainees from the Interim process were phased in during FY 2014 Quarter 4 (July 1 to September 30) and of the handful of those trainees that have attended the Academy, 64 percent have passed.

[Graph: "Academy En Route Pass Rates Over Time" -- class pass rate (0% to 100%) plotted by fiscal-year quarter; legend: Pre-Curriculum Change; Post-Curriculum Change; Post-Hire Change; Linear (Post-Curriculum Change).]

*Calculation Notes: Fiscal year based on a start date of 10/1, with 10/1 to 12/31 being Quarter 1; Pass rate based on # of passes / (# of passes + # of fails); for display purposes the FY 2011 Quarter 1 calculation includes one class from FY 2011 Quarter 2, as it used the assessment approach; Initial FY 2014 Quarter 4 classes contained a mix of pre- and post-hire-change students. As such, FY 2014 Quarter 4 classes are not presented in the graph.
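[Worked example of the stated pass-rate formula using hypothetical counts, not figures from the record: a class with 86 passes and 14 fails has a pass rate of 86 / (86 + 14) = 86%.]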

2. Towers. The FAA implemented changes to the Tower Qualification grading regime at the Academy concurrently with the initial hiring from the 2014 Interim process. Unlike with the En Route Qualification regime, this means there is no data post-grading-change that does not also include the initial hires from the 2014 Interim process, precluding an "apples-to-apples" comparison.

Finally, your letter asked about students who have been determined to be unqualified at the Academy for non-academic reasons. No student has been removed based upon a failure to be qualified. Rather, four selectees have been removed from the FAA Academy for cause since we
implemented the interim hiring process in 2014. Removal for cause could include behavioral infractions independent of suitability, such as driving while under the influence of alcohol or cheating at the Academy. This rate is consistent with the previous rate of terminations.

Notably, the screening employed under both the Interim Process in 2014 and the 2015 process has led to the dismissal of some candidates before they begin training. For example, as previously indicated, under Basic Qualifications on the application, the first question asks, "Are you able to speak English clearly enough to be understood over radios, intercoms and similar communications equipment?" In the past 2 announcements, 116 applicants responded no and were eliminated from the process.

We operate the safest and most efficient airspace in the world, in large part due to our dedicated workforce. As we close out the 2015 hiring process, we are working diligently to bring on board those hired during that process. We will continue to build on the improvements made over the past 2 years to find additional efficiencies while hiring those most likely to become certified professional controllers.

We have sent an identical response to each of the cosigners of your letter.

If I can be of further assistance, please contact me or Molly Harris, Acting Assistant Administrator for Government and Industry Affairs, at (202) 267-3277.
Sincerely,

[signature]

Administrator

EXHIBIT 2

[Biographical Assessment -- Page 1 (scanned form). The scan's multi-column layout did not survive extraction; the question stems that remain legible are reproduced below in numeric order. Response options are only partially legible in the source scan.]

1. How would you describe your ideal job?
4. Which of the following statements do you agree with the MOST?
5. People who know me would say that when things go wrong, I . . .
6. Which of the following is your greatest strength?
7. I am more . . . (eager / considerate)
8. More classmates would remember me as . . . (humble / dominant)
9. Others would be most likely to describe me as a person with great . . .
11. I would rather be known as a person who is . . .
12. I would rather be known for my . . .
13. I often start things I do not finish. (True / False)
14. I would rather be known . . .
15. The high school subject in which I received my lowest grades was . . .
17. I learned about the opportunity to apply for an Air Traffic Control Specialist (ATCS) job through . . .
20. Relative to other high school students in my major field of study, . . .
21. The number of different high school sports I participated in was . . .
23. During my last year in college, my average number of hours of paid employment per week was . . .
24. The college English grade I most often received was . . .
25. In the three years immediately before applying to this job, the number of different full or part-time jobs I applied for was . . .
26. The number of months I was unemployed during the three years immediately before applying to this job was . . .
27. Which of the following would your peers say describes your behavior in a group situation?

[Additional legible stems whose question numbers are not legible: In the past, what did you do when you were working on something and nothing seemed right? What has been the major cause of your failures? When it comes to getting work done, I am . . . The aspect of being an air traffic controller that appeals to me most is that . . . Of the following, the college subject in which I received my lowest grades was . . . A question concerning military service (non-career enlisted / non-career officer / career enlisted / career officer).]

[Biographical Assessment -- continued (scanned form). Legible question stems, in numeric order; response options are only partially legible in the source scan.]

51. Do you have prior Instrument Flight Rules (IFR) experience?
52. How long do you think it will take to become fully proficient in your job?
53. Do you have an Associate and/or Technical/Military/Vocational-Technical degree?
54. Which of the following BEST describes aviation coursework taken towards your Associate and/or Technical/Military/Vocational-Technical degree?
55. Of all the Air Traffic Control Specialists (ATCSs) in the country, at what percentile do you think you will be able to perform?
56. Overall college grade point average
57. In your current or previous job(s), education, or other similar experiences, how did you usually feel about assignments changing at the last minute?
58. Which of the following would bother you the LEAST?
59. At which one of the following do you excel?
60. When finished with my work, I:
61. How does it make you feel to do poor quality work?
62. How have you planned your work activities at work, school, or in other similar situations?

AVIATOR :: Online Application | Step 8 of 15
OMB Control Number 2120-0597
Announcement: FAA-AT0-15-AU.SRCE~0166
Job Title: Air Traffic Control Specialist - Trainee
Series: 2152
Grade(s): FG-1
Status: In Progress
Closing Date:
Date Submitted: 3/27/2015

!!! Timeout Warning !!!
For security purposes, your session will end after an extended period of inactivity. A warning will be presented to you if your session will timeout soon.
Important: Typing and/or scrolling does NOT constitute "activity". Please Save often!

Step 8 of 15
Please correct the following:
An answer to question #16 is required.

Biographical Assessment - Education and Background

1. What is the highest level of education you have completed?
   Did not complete high school.
   High School diploma/GED
   Attended college but did not earn a degree
   Have earned a two-year degree
   Have earned a Bachelor's degree
   Have earned a Master's degree or higher

2. If you have earned at least a two-year degree, in what major was that degree? (Check all that apply)
   Engineering (including Aerospace Engineering)
   Have not earned at least a two-year degree
   [remaining options not legible in the scan]

3. If you have completed formal aviation coursework, please check the following areas that were covered. (Check all that apply)
   Have not completed formal aviation coursework
   [remaining options not legible in the scan]

4. How many credits of formal aviation coursework have you completed (including courses with an aviation-specific curriculum but excluding general education and other degree requirements)?

5. How many credits of formal air traffic control coursework have you completed?

AVIATOR :: Online Application | Step 13 of 15

101. In the past year, the number of times I have been late for work or an appointment is:
   [first option not legible]
   three or four times.
   five or more times.
   I am never late.
   I am not sure.

102. My co-workers or classmates would probably describe me as a person who:
   never takes chances.
   hardly ever takes chances.
   sometimes takes chances.
   often takes chances.
   very often takes chances.

103. In your past or current job(s) and school experiences, how did you respond when someone criticized your work?
   I became frustrated.
   I ignored the criticism.
   I tried to understand the reason for the criticism.
   I became somewhat discouraged.
   I accepted the criticism.

104. In your current or previous job(s) (or school, if not previously employed), which of these were you MOST comfortable doing?
   Taking on additional work.
   Assisting others with their work.
   Learning new tasks.
   Staying late to complete assignments.
   Teaching others how to do things.

105. When you are late for an appointment, meeting, or work, what is usually the reason?
   I am so busy that I sometimes get behind schedule.
   Traffic or public transportation delays.
   I sometimes lose track of time.
   Other people or situations get in the way.
   I am always on time.

106. According to your supervisor and co-workers (or teachers if not previously employed), how do you typically respond when the priority of tasks changes?
   I get frustrated because I have to re-organize my work.
   I finish what I am doing and then move on to the highest priority task.
   I re-evaluate the priorities of all tasks.
   I ask to continue my task rather than switching to a new task.
   I start on the new, higher priority task immediately.

107. In your current or previous job(s), education, or other similar experiences, how did you usually feel about assignments changing at the last minute?
   I did not mind it, but preferred that the assignments did not change.
   I did not like it, and it affected how much I liked my work.
   I liked it, and it made my work more challenging.
   I liked it, but preferred to know about changes before the last minute.
   I have not had an assignment change at the last minute.

108. Which of the following BEST describes your preferred work environment?
   I thrive in a changing environment because I do not like repetitive tasks.
   I prefer an environment where I can follow a routine.
   I like a changing environment where I work on several projects/tasks at once.
   I prefer a more structured environment.
   I am not sure.

109. In your current or previous job(s), education, or other similar experiences, how did you typically handle unexpected changes in the middle of a task or project?
   I waited until things settled down and then continued with what I was doing.
   I kept working on my task so that it could be completed on time.
   I changed what I was doing to accommodate the unexpected event/circumstance.
   I worked on something else that was not affected by the unexpected change.
   I am not sure.

110. People who know me well would describe me as:
   a risk taker.
   a very cautious individual.
   a very calm and collected individual.
   someone who can handle stress well.
   I am not sure.

111. Your manager (or teacher) would say that when you make an error at work (or school), you:
   are visibly upset.
   feel guilty.
   move on quickly.
   fix it immediately.
   do not give it much thought.

112. When my approach for completing a task is not working, I usually:
   continue with the same approach; it will eventually work.
   ask a co-worker or classmate how to complete the task.
   change my approach.
   stop working on the task.
   I am not sure.

113. People who know me would say that when I encounter an obstacle at work or school, I:
   initially struggle a bit to get back on track but eventually recover.
   get frustrated.
   have an extreme amount of tolerance.
   only get frustrated when the obstacle is major.
   have little tolerance for dealing with obstacles.

Next

EXHIBIT 3

TELEPHONIC CONFERENCE

IN RE:

FAA NEW HIRING PRACTICES

JANUARY 8, 2014



FAA NEW HIRING PRACTICES
JANUARY 8, 2014
START TIME: 3:00 P.M.
LOCATION: TELEPHONE
RECORDING FILE NAME: 1_8 Telcon.wav

UNID MALE: Do you want to make kind of just a comment on there about the date and time?

UNID MALE 2: Yes, this is January the 8th, at approximately 3:00 in the afternoon. We're waiting for the telecall -- teleconference call from the FAA on the new hiring practices. We're in Corkey Romeo's office with Kyle Nagle, Ryan Davis, Jim Scott, and Wayne Ressitar.

13

THE OPERATOR:

Welcome, and thank you for standing by.

At this

14

time all participants will be in a listen only

15

mode until the question and answer session, and

16

to ask a question at that time please press star

17

then 1.

18

Today's conference is being recorded.

If

19

you have any objections you may disconnect at

20

this time.

21

And now I'm going to turn the meeting over

22
23
24
25

to Mr. Frazer Jones.


MR. JONES:

Good afternoon, everyone.

Thanks for calling

today's telcon.
In today's call, Mr. Joseph Teixeira, Vice

26

President Safety and Technical Training, Mr.

27

Rickie Cannon, Deputy Assistant Administrator

28

for Human Resource Management, and Mr. John


Scott, Chief Operating Officer APT Metrics, will

provide the latest guidance on upcoming changes

to the air traffic control specialist hiring

process and respond to your questions.

Thanks for you time and expertise today.

We'll turn the call over to Mr. Teixeira.

7
8

MR. TEIXEIRA:

Thank you, Frazer, and thank you everyone for


joining us and good afternoon.

I'm joined today by several FAA officials

10

who are prepared to brief you and to answer as

11

many of your questions as we can.

12

As indicated in my letter to you on

13

December 30th, we'll be providing you a briefing

14

on the improvements we plan to make to the

15

controller hiring process in 2014.

16

you know we've been looking at the entire

17

process of hiring, selection, training, and

18

assignment of air traffic controllers for at

19

least two years.

20

As many of

We began this process in September of 2011

21

when we received 49 recommendations from the

22

independent review panel, a government and

23

industry-wide panel who conducted a

24

comprehensive review of the entire controller

25

hiring process and recommended many

26

opportunities for improvement in that entire

27

process.

28

Also, as I indicated in my recent letter to


CTI schools, we conducted an analysis of the ATC

occupation that was guided by EEOC Management

Directive 715, also known as a barrier analysis.

We were helped in this analysis by two of the

most prominent firms in this field.

Those reports are available to you on our

website if you'd be interested in it, and so is

the reports on the independent review panel.

As a result of these and many other efforts

10

we'll be making many -- we will be making many

11

improvements to the way we select, train, and

12

assign air traffic controllers over the next few

13

years.

14

expect to take effect in February of 2014, in

15

conjunction with a planned announcement to hire

16

new air traffic controllers.

17
18
19

Today we will highlight changes we

Before we do that, however, I would like to


note the things that we are not changing.
We are not planning to change the ongoing

20

support and relationship that exists between FAA

21

and CTI schools, and we will continue to strive

22

for improvements in that relationship.

23

benefits from the education they provide to

24

students and the passion for aviation they

25

engender in students and prospective FAA

26

applicants.

27
28

The FAA

We're also not changing any policies


regarding veterans preference or the president's


guidance that federal agencies support the

hiring of veterans of our armed forces.

With regard to the process improvement

we're making, let me recap what those changes

are.

We plan to standardize the application

process and to make sure all applicants use a

single application process for a single national

job and are eligible for national assignment

10

upon successful completion of the academy.

This

11

will not only simplify the application process,

12

but it will help the FAA ensure that individuals

13

are not unnecessarily delayed in getting academy

14

classes.

15

Academy graduates will be assigned to where

16

they are needed and where vacancies exist at the

17

time of their graduation versus the current

18

system of having to evaluate those needs one

19

to -- one to three years in advance.

20

Additionally, we are retaining many of the

21

current requirements such as the SSAT test.

22

However, interviews will not be used for those

23

applying under the February announcement, and we

24

are studying whether to continue their use long

25

term.

26

I know that many of you have questions

27

concerning the status of those individuals who

28

currently have tentative offer letters, and I


have gotten a lot of questions from you

directly.

who will give you an update on that and other

topics.

5
6

So let me turn now to Rickie Cannon

Rickie.
MR. CANNON:

Thank you.

Thanks, Joseph.

We have roughly 945 employees who have

tentative offer letters; of that number 543 were

CTI students.

The Agency plans on honoring all

10

of those tentative offer letters, with the

11

exceptions being for things such as those who do

12

not clear security suitability and/or medical

13

issues.

14

We are also having to change some facility

15

locations to meet agency needs, but we will

16

honor, and we expect to honor all those, and it

17

is anticipated that all 543 of the CTI

18

applicants who were extended offer letters

19

should be placed in the -- should be placed into

20

academy classes this year.

21

With regard to those individuals in the

22

inventory who do not have tentative offer

23

letters, we will be sending a letter to those

24

individuals, we believe early next week,

25

indicating to them that the current inventory

26

will be closed.

27

information with regard to the upcoming February

28

announcement and would be pleased with all those

And we will give them some


individuals if they retained interest in

becoming air traffic controllers if they would

apply.

With that, I'm going to ask Dr. Scott to

walk you through the few changes being made in

the hiring process starting in February.

7
8

Dr. Scott.
MR. SCOTT:

Thanks, Rickie.

And you know our key focus

since the barrier analysis, really over the past

10

six months, has been to revise the ATCS process,

11

not only to address the barrier analysis

12

findings, but to ensure a fair and balanced

13

selection system for hiring the most qualified

14

candidates for this job, and that has led us to

15

a number of changes and suggestions.

16
17
18

One is the vacancy announcement open to all


U.S. citizens that Joseph talked about.
In addition to the efficiencies created

19

there, we're going to be establishing a single

20

set of minimum qualifications across all

21

applicant sources, which was a critical

22

recommendation that came out of the barrier

23

analysis.

24

And in addition, we're going to eliminate

25

reliance on location preferences so that

26

otherwise qualified candidates are not knocked

27

out of the -- out of consideration based solely

28

on these location preferences, and this was


another issue raised in the barrier analysis.


Our analyses also looked at areas related

to the testing process.

process removes the experience questionnaire

from the SSAT and replaces it with a new

biographical questionnaire that will now be

placed up front in the application process.

And our revised testing

This change ensures that those candidates

that are referred onto the SSAT who have already

10

been prescreened for education, experience, work

11

habits through this valid and efficient measure.

12

So only those individuals passing the

13

biographical questionnaire will then move on to

14

take the revised SSAT.

15

This leads us to the next process change in

16

the hiring, which is the elimination of

17

centralized selection panel.

18

replaced with an automatic scoring algorithm

19

that takes into account a weighted composite of

20

the bio data along with the revised SSAT, and

21

also accounting for veteran's preference.

22

These will be

These changes were meant to increase the

23

speed and efficiency, and the decision-making,

24

as well as to increase the objectivity in the

25

assessment of candidate characteristics and

26

capabilities.

27

of the process at this point, or a high level

28

overview.

That's sort of a large overview

Okay.


MR. TEIXEIRA:

Okay.

Frazer, this is all that we intended to

communicate, would you take us to questions,

please?

MR. JONES:

5
6

Becka, could you prompt our audience for


questions, please?

THE OPERATOR:

Thank you.

And to ask a question, please press

Star 1 and use your phone and record your name

when prompted.

press Star 2, and once again to ask a question,

10

To withdraw your request, you

please press Star 1.

11

First question will come from Scott Miller

12
13

(phonetic); your line is open.


MR. MILLER:

Thank you, gentlemen, I appreciate the

14

opportunity here.

15

a list of questions that a number of us had.

16

Would you be able to start with those questions

17

right now?

18

MR. TEIXEIRA:

I was the one that forwarded

Our preference would be not go through every

19

single one of them.

20

addressed them at least in groups during our

21

initial presentation.

22

we welcome your specific questions at this time.

23

MR. MILLER:

That's a shame.

We believe we have

If that is not the case,

It was indicated in the email,

24

Mr. Teixeira, and I apologize for the

25

mispronunciation a friend of mine spelled it the

26

same way and that's how he pronounces it.

You indicated in the email back to me that each and every one of those questions would be addressed during this call.


MR. TEIXEIRA:

Okay.

I'm not disagreeing with you.

I'm saying

that I believe we have addressed those

categories of questions in our specific

presentation.

inviting you to ask a question.

MR. MILLER:

Okay.

If that is not the case, I'm

You can go ahead and move onto other

questions now.

airport getting ready to get on an airplane.

I am actually standing in an

10

But I'll have to dig out my laptop and get the

11

questions out so that's going to take me a few

12

minutes here.

13

to allow someone else to ask some questions.

And I gladly relinquish my time

14

MR. TEIXEIRA:

Thank you.

15

THE OPERATOR:

Next question is from Mike Pearson (phonetic).

16

MR. PEARSON:

Yeah, gentlemen, this is regarding the

17

biographical questionnaire; first of all, who

18

designed that?

19

UNID MALE:

The biographical questionnaire was designed

20

through CAMI, and researched as well --

21

thoroughly researched through CAMI, and we've

22

done some additional research with it as well,

23

so it is proven to be a valid instrument for

24

assessing experience, work habits, education,

25

and so on, and dimensions that are related to

26

the success on the job.

27
28

10

MR. PEARSON:

When you said it's been researched by CAMI,


where is that research at; has it been


published?

UNID MALE:

Yeah, I believe it has.

MR. PEARSON:

Can you -- oh, can you give me a site, or could

4
5

you give me one later?


UNID MALE:

6
7

We -- we'll look into that and get back to you


with it.

MR. PEARSON:

8
9

Yeah.

Is the race -- questionnaire race and gender


neutral?

UNID MALE:

Yes, it is.

10

MR. PEARSON:

Are all questions race and gender neutral?

11

UNID MALE:

Yes, they are.

12

MR. PEARSON:

How do you grade your biographical

13

questionnaire?

14

biographical questionnaire it's subjective.

15

you give us an idea how you're going to do that?

16

UNID MALE:

By the very definition of a


Can

The items themselves have been related to and

17

correlated against performance on the job, and

18

different weights assigned to those questions

19

based on how well they correlate to various

20

dimensions -- performance dimensions on the job.

21

MR. PEARSON:

So the grading itself will be objective, there

22

are measures in place.

23

subjective determining who's grading the test?

24

UNID MALE:

25
26

It's not going to be

That's correct; they're all predetermined,


pre-weighted and already established.

MR. PEARSON:

How come no one in the CTI institutions were

27

asked to have a seat at the table regarding the

28

barrier analysis and the potential impact it has


on the CTI process?


MR. TEIXEIRA:

3
4

These policy changes affect all FAA applicants,


of which CTI are enforcing.

MR. PEARSON:

I understand that.

But I was just basing my

question on your intro about the valuable

relationship you have with CTI institutions.

seems to me if you wanted feedback from the get

go, and I'm not criticizing you, I'm just

asking, was it ever round-tabled or decided who

It

10

would have a seat at the table; I assume certain

11

other special interest groups did?

12

MR. TEIXEIRA:

There were no special interest groups involved

13

in the design of the FAA policy at all.

14

was done by experts in the human resources

15

department and civil rights.

16

This

And we certainly did take from information

17

provided to us by the independent review panel

18

two years ago, and in that review CTIs were

19

involved.

20

MR. PEARSON:

Can you tell me why the students, or the

21

applicants that have been screened for the

22

AT-SAT or CTI graduates, including a very large

23

population and portion of those students are

24

minority students, why the hiring was put on

25

hold for two years, as the FAA needed to hire

26

and was projected to hire over 1,000 controllers

27

per year; why was that done?

28

MR. MCCORMICK: Mike, this is Mike McCormick, I'm the Vice


President of Management Services for the air

traffic organization and if I could handle that

question for you.

Essentially as you're probably aware of

from all the press that's been generated over

the course of the past 12 to 18 months we were

significantly impacted by the sequestration.

in March of 2013 we had intended to do a central

selection panel based upon a September 2012

10

So

vacancy announcement that we had put out.

11

Unfortunately, that had to be canceled

12

because of the impact of the sequester and the

13

save money furloughs that were implemented in

14

April.

15

As a result of that, also a hiring freeze

16

went into effect on March 1st, and we were

17

unable to hire any new safety workforce into the

18

air traffic organization from March 1st until

19

this very week, when we -- our first opportunity

20

to reopen academy and do hiring in our safety

21

workforce including our technicians and our

22

controllers.

23

So the unfortunate outcome of that is we

24

have not been able to do any hiring up until

25

today.

26

MR. PEARSON:

What about the first part of that, before

27

sequestration took place there, Mr. McCormick,

28

basically the year prior to 2012?


MR. MCCORMICK: Basically, it's normal that there is a delay

from the time that we do a vacancy announcement

to act to put together a central selection

panel, so that's not an unusual process for us.

MR. PEARSON:

No, I understand that that, but if the delay --

and I -- the sequestration was actually March or

April of 2013; correct, on my timing?

MR. MCCORMICK: The sequester went into effect on March 1st of

2013, the vacancy announcement was published in

10

September of 2012, so a six month lag time is

11

not unusual for (indiscernible) a package in

12

central selection panel.

13

MR. PEARSON:

Okay.

As far as the next issue, I believe your

14

consultant's report, I don't know since we don't

15

really have access to detailed information about

16

how this was formulated, who participated,

17

except you said there were no outside special

18

interest groups.

19

said CTI institutions are not a barrier, or to -

20

- regarding the barrier analysis qualifications;

21

is that not correct, or did I read that wrong?

I believe your report itself

22

MR. TEIXEIRA:

Could you ask that again?

23

MR. PEARSON:

Yeah, I believe that your own consultant barrier

24

analysis report pursuant to the EEOC doctrine

25

stated that CTI institutions were not a barrier;

26

is that correct?

27
28

MR. TEIXEIRA:

Well, those are -- okay.

If I can try to

formulate your question slightly different we


can probably give you a better answer.

would say that the evaluation was of the FAA

processes.

or their education or their preparations.

But I

We didn't evaluate the CTI schools

So we did a review of the FAA's hiring

process, and we're making changes to the FAA

hiring process, not to the CTI schools.

MR. PEARSON:

Okay.

Did the barrier analysis report itself

not say that CTI institutions were not a

10
11

15

barrier?
MR. MCCORMICK: Mike, I think that would be difficult for us to

12

answer that question because we don't have the

13

entire report laid out in front of us and we're

14

not able to go through and cite every portion of

15

it, so you have us at somewhat of disadvantage

16

there.

But I think I need to reinforce what Joseph shared with you, and that is; this was an internal review of our agency hiring and placement practices in support of our technical workforce of air traffic controllers. And we made the policy changes to support the internal agency process. We continue and we did not evaluate at all the CTI institution (indiscernible) and continue to support those institutions, and we continue to value our relationship with them.

MR. PEARSON: Yeah, I appreciate that. I'm sure everybody on the line appreciates that --

RECORDING:

After the tone please state your name.

MR. RESSITAR:

Wayne Ressitar.

RECORDING:

Thank you.

MR. PEARSON:

-- my question at all, and I assume based on

your answer, you folks haven't read your own

barrier analysis report, or you don't have it in

front of you.

MR. TEIXEIRA:

Okay.

Is that fair to say?

Mike, I think -- this is Joseph Teixeira

10

speaking.

11

question you're asking was outside of the scope

12

of barrier analysis.

13

other than to say it was beyond the scope of the

14

barrier analysis.

15

many different ways.

16

I can't answer yes or no,

I've tried to say that in

So the answer is it was evaluated by the

17
18

I think I've been very clear that the

barrier analysis.
MR. PEARSON:

Do you guys have, or did you use any

19

quantitative data -- or excuse me -- qualitative

20

and quantitative data regarding the success rate

21

of a typical CTI graduate to pass the AT-SAT

22

that goes into a facility versus an off the

23

street hired, or is this change based upon

24

diversity criteria only?

25

MR. TEIXEIRA:

This is -- these changes are made to hiring

26

process, not to the success rate of the

27

individuals.

28

selection, training and assignment of the

So we tried to improve our


individuals.

broad review that started in 2011 with the IRT

report, and it's subsequently assisted by the

barrier analysis result.

It's an encompassing and very

And if I can have you formulate perhaps

your last question and let -- go to some other

folks and then you can rejoin if you -- that

would be very helpful.

17

MR. PEARSON: Yeah, I have no problem releasing the floor to someone else. However, I do have many more questions. Why, since you have people that have been qualified on the rolls, are you not going to allow those students that invested thousands of dollars, multiple years and, quite frankly, have had at least a tacit promise by the FAA that they would be given a look at being hired -- why in the world when you have that backlog, those people that are already qualified, would you basically force them to back through a new process instead of just trying the process -- the new process prospectively and notifying people ahead of time it would be done that way?
MR. CANNON: Those -- sir, this is Rickie Cannon. Those individuals in CTI schools and any other U.S. citizens are not being disenfranchised of an opportunity to apply to become an air traffic control specialist. That's what the February announcement is all about. They will have an opportunity in February to apply and to be considered along with any other U.S. citizen who wants to apply for that job. And as Dr. Scott said, you know, the biographical data and everything. We'll take into consideration their experience and education and all of those related dimensions associated within an air traffic controller. So those individuals are not being disenfranchised, or held out of the process.
MR. PEARSON: Well, I don't want -- I just want to do the follow up to that there, Rickie, and I don't want to quibble with you on that. But it appears that you conducted this barrier analysis after you already had an objective program of assessment called the AT-SAT; is that correct?
MR. CANNON: I'm not following you there?
MR. PEARSON: There are students that are on the rolls that prior to the barrier analysis report either being commissioned, or finalized, had already passed an objective testing mechanism called the AT-SAT developed by the FAA, correct?
MR. MCCORMICK: Oh, Mike, this is Mike McCormick again. What we want to be able to do is that we have a multiyear improvement program to correct a multiyear problem and this goes back beyond just the 2012 vacancy announcement. We have vacancy announcement lists that go back several years that we have not hired from. And so we need to go back through all of these lists and actually purge all of them going several years worth of lists, not just the CTI list, in order to get the pool of candidates through the new vacancy announcement that we intend to publish in February. So we've had a lot, a lot of discussions. This is not something that we've taken lightly by any means. We've weighed a lot of pro and cons here within the agency, and a lot of robust discussion around it, and we have decided to go on with this policy decision and wanted to spend some time today to communicate, one, that decision and be able to give some of our rationale behind it. So with that, Mike, I think we'd want to turn it over to the next questioner, and if I could answer a question --
MR. PEARSON: Mike, actually, what I'd like you to do is answer yes or no, not, quite frankly, not the PR line. Yes or no, was the disparity analysis -- the barrier analysis report, either commissioned or finished after you already had an objective assessment and process, the AT-SAT? That's my question?
MR. MCCORMICK: Mike, the AT-SAT has been in place for many years so you know the answer to that question is that yes, we commissioned the (indiscernible) the program had already been in place.
MR. PEARSON: Right, and after the assessment, the AT-SAT had been implemented, you're now changing the characteristics of how you're going to hire people by basically stating you have to reapply, even though you're on the rolls to be hired, you have to through this new process. That's all I want to know, is that how you're going to proceed? Everybody that has been in the queue basically is going to have to, again reapply one more time for the jobs they believe they had already applied for and were queued for; is that correct?
MR. MCCORMICK: Mike, if you're aware of the history of the hiring within air traffic control profession --
MR. PEARSON: I worked for the FAA 27 years, sir, in all different types of options, so I'm very aware of not only the --
MR. MCCORMICK: As you --
MR. PEARSON: -- hiring process (indiscernible) --
MR. MCCORMICK: -- as you're aware --
MR. PEARSON: -- FAA.
MR. MCCORMICK: We've made a lot of changes over the course of those 27 years that you were within the agency, and we've made a lot of changes to the hiring program for air traffic controller profession, including going from several different testing mechanisms, and finally evolving into the AT-SAT. This is another evolution of that process, and every time we have revised the process we have, in fact, employees reapply into it. So this is not something that we're doing either arbitrarily or capriciously in this event or past events over the course of that time. So again, and what I want to ask for future questioners if you could identify what organization you're also representing just so we can have a full picture of who we're talking with, that would certainly help us out a lot too. So thanks again, Mike.
THE OPERATOR: Next question is from Victor Hernandez (phonetic).
MR. HERNANDEZ: Victor Hernandez from Miami-Dade College.
UNID MALE: Hello, Victor.
MR. HERNANDEZ: (Indiscernible) -- just to make sure the students that have received a tentative offer letter, they do not have to reapply; is that correct?
UNID MALE: That is correct.
MR. HERNANDEZ: Okay. Now, as far as the work experience that's listed on there, it's very vague, and I know there's a combination of the years of schooling and work experience. Can you go over like what kind of work experience are you looking for? I know it says -- using the word "progressive," but it's still very vague. If you can clarify that a little bit?
MR. CANNON: Yeah, this is Rickie Cannon. Nothing is changing from how we have looked at that. Basically, if there is any work experience there, the way this is going to be looked at is the way that it has always been looked at. If there's work experience there, pretty much of any kind, it is going to be creditable toward the general experience requirement.
MR. HERNANDEZ: Okay. I understand. And I've heard of two different things. Is February 10th, is that when the student has to go online and apply, or does it open 10 days earlier and closes on that February 10th?
MR. CANNON: This February 10th is a projected date. The Agency will be putting out quite a bit of communication ahead of the announcement actually opening. I would recommend that you, you know, watch for some of that messaging, or go on USA Jobs regularly. You know you can also go on USA Jobs and actually post in your profile the ability for them to even send you transmittal when a particular kind of an occupation is announced. So we are projected for February 10th, but again that's a projection.
MR. HERNANDEZ: Okay. So it is February 10th, and is it true that may be open like for 10 days then it'll shut down at that time?
MR. CANNON: Yes, we are projecting a ten-day announcement -- open period.
MR. HERNANDEZ: Okay. And you're looking to hire 3,000 with that; is that accurate?
MR. CANNON: Pardon?
MR. HERNANDEZ: You're looking to hire like 3,000 off of those within those 10 days?
MR. CANNON: No. No, we are not looking at any particular number of hires. That announcement will be used to initiate and establish a new inventory, and our customer organization -- the air traffic organization, will decide based on what their needs are, how many people are selected from the inventory.
MR. HERNANDEZ: Okay. All right. Thanks. I let somebody else ask you the next question. Thanks.
MR. CANNON: Thanks, Victor.
THE OPERATOR: Next question is from Kevin Kuhlmann.
MR. KUHLMANN: Hello, this is Kevin Kuhlmann from Metropolitan State University at Denver. Appreciate you taking the time to answer questions. I know that everybody's a little tense over this and hopefully we keep everything very civil. My question is, I don't quite understand as far as you say you want to maintain and you cherish your relationship that you have with the CTI schools. What do we tell our students that that relationship is? What advantage do they have by being a CTI student in the application process?
MR. TEIXEIRA: Well, I don't know -- this is Joseph Teixeira. I don't know that there were ever any advantages in the application process. I believe people gain greatly from their passion to aviation, and from their education that they get at CTI schools, and that that knowledge and that passion is reflected in test scores and in their experience as they apply.
MR. KUHLMANN: Well, they did have a distinct advantage of going to the CTI school, we had -- the FAA told each of the CTI schools your curriculum must contain XYZ information. The -- so that we had commitments from the FAA of what we had to do. And the FAA had commitments from us that we would do certain things, register the student, recommend the student to the FAA. When they were recommended we made it very clear that that was not -- there was no promise of employment, there was only a promise for the opportunity for employment with the FAA. And that because you were recommended that you would be -- as long as you completed your USA Jobs and the previous website, as long as you completed that all properly when the announcement came out -- that you would be provided the opportunity. And it was a very distinct advantage, and it is the reason these students came to CTI schools because they gained that advantage. Now, basically, they're on even footing with everyone in the public. So I don't understand why you don't understand that the whole reason of the CTI program was advantageous to the student, and that's how they were drawn to the program.
MR. TEIXEIRA: I wouldn't disagree that the mechanics provided that image. But as you've stated there were no guarantees in that process, and the really guarantee thing that they would've got out of this relationship is a curriculum that we thought would be advantageous for their learning and would make them competitive in a competitive process. But in your own -- in your own words there was no guarantee. There is no guarantee now.
MR. KUHLMANN: Well, there was no guarantee of employment and we had to make sure everyone knew that. But there was -- here's another advantage was that they would not have to go through AT basics course, the six-week initial course. Is that still -- is that off the table now?
MR. TEIXEIRA: No, that is not off the table now. Those changes once you are hired are under review. I'm not prepared to make a statement on that now, but --
MR. KUHLMANN: Okay. Do the CTI schools have to follow the same procedures that they -- as in do have to recommend students anymore, or do they just go when there's an announcement and apply like everyone else?
MR. TEIXEIRA: As I described to you earlier, we've made significant changes to standardize the application process, which will have significant benefits to students and to the FAA. One of the things that I highlighted, one of the big complaints I get from schools when I travel, is that people stay in the queue, people with a tentative offering letter for 12 to 18 months waiting for a vacancy in the two states that they applied, and by eliminating that, we're able to process people as quickly as possible through the academy and they will placed in the next available vacancy. So a lot of these changes from an individual point of view will have pros and cons. But our view is that the improvements we make -- we're making, are just that, by standardizing the process, and by putting everybody through to the next available vacancy, that we're actually helping everyone.
MR. CANNON: And let me follow on -- this is Rickie Cannon. Let me follow on to Joseph's point. No, in -- for this announcement in February, there is no need for a CTS (sic) student or any other applicant to have a letter of recommendation. They are applying under a U.S. citizen announcement. Certainly if they want to -- if they want to have that in their resume, that's fine as well as an attachment if they want to send it. But there is, you know, no additional preference or anything provided with that. They will be applying and competing with any other U.S. citizen who wants to apply. Okay.
MR. KUHLMANN: I understand that.
MR. MCCORMICK: If I could share just a little bit of background on the advantage of a CTI university or program for an applicant. If a person in the past has been a competitive applicant for hiring as a CTI student, he will remain a competitive applicant under this process, that does not go away. So we definitely see a real opportunity for the CTI schools to continue to provide the education, the background, and as Joseph mentioned, more importantly the passion for the career field; that would create a pool of highly qualified candidates who are applying to compete for these positions. One of the things that we feel that is highly critical and very important for us in this applicant pool is that we have not had the opportunity to do a general public hiring for several years. We need to do that and we are going to do that in February.
MR. KUHLMANN: Yeah, and a biographical question there. Is there a specific question that asks, did you attend an AT-CTI university, or receive a degree from an AT-CTI university?
UNID MALE: There are questions that get at -- that get at that.
MR. KUHLMANN: Okay. Thank you.
THE OPERATOR: Next question is from Ed Mummert.
MR. MUMMERT: This is Ed Mummert from Embry-Riddle Aeronautical University. I was just curious. I don't think the question was specifically answered. Are we still going to be sending CTI enrollment information to aviation careers in Oklahoma City? And are we going to be sending recommendations once a student graduates?
MR. CANNON: This is Rickie Cannon again. And there will be no need to send that kind of information. I don't think we've actually formally said don't because it's -- we're working through this process. It's kind of -- it is just -- but for -- and all U.S. citizen open vacancy announcement there would be no reason. The applicant themselves can put whatever their credentials are in their resumes as they apply, and those stand alone against all the other competing employees.
MR. MUMMERT: Okay. So there's no need for us to enroll students anymore; is that correct?
MR. CANNON: Well, I don't know what you mean when you say no need to enroll students?
MR. MUMMERT: By enroll, I mean send all their information to Oklahoma City, which is kept in a database there, and then once they graduate we send in the recommendation.
MR. CANNON: No, I don't believe there will be a need to do that. No, sir.
MR. MUMMERT: Okay. Thank you.
THE OPERATOR: Next question is from Felix Esquibel.
MR. ESQUIBEL: Hello, I'm representing Western Michigan University CTI, a couple of items. Can any of you explain exactly then what you're not changing, and that was the FAA and the AT-CTI support? What kind of support will the CTIs be getting from FAA? And secondly, can you speak to a rumor, or the impetus of this, as being a lawsuit that not everyone can afford to go to college, and therefore that created disparate impact?
MR. MCCORMICK: Hi, Felix, this is Mike. I can definitely clarify for you that there is no litigation -- pending litigation, (indiscernible) that is driving any decision in this process at all. So this is strictly around our opportunity to review our hiring, selection, placement process to get our best pool of candidates for the air traffic control profession.
MR. ESQUIBEL: Okay. Then can you speak to what kind of support the CTI schools will be getting from you? Because it seems to us that we no longer need to have this program? So is that pretty much what we're hearing is that the FAA no longer needs the CTI schools?
MR. TEIXEIRA: You're certainly not going to hear that from me, Joseph Teixeira here. I think that these schools are our key and variable intake of qualified applicants to the FAA, not just the air traffic organization. I think they're typically part of aviation colleges, which there about 105 in the United States, and we absolutely rely on them as a source of education and inspiration for people who want to join the aviation career field. So I clearly see a need for this program, and we'll continue to support you in any way we can, so let us know what kind of help you need.
MR. ESQUIBEL: Well, we need some assurance that our efforts are going to be continued as a partnership with the FAA, that's kind of what I hear and have read with many of the emails. One of the other questions that I do have is that if the graduate's already taken the AT-SAT and has passed that, but does not have a TOL, wouldn't it just be more prudent to have them then go back and just take the biographical, rather than having them complete, again to take the AT-SAT?
MR. CANNON: Well, this is Rickie Cannon again. We have made the decision that because this is a new process that for all those applicants who will have to apply, again in February, they will go through the complete process. We certainly did have that discussion and it was a very robust discussion within the agency. But the decision which was made that everyone will go through the same process.
MR. ESQUIBEL: Okay. And one last question. How many open to the public vacancy announcements do you plan on doing per year? I understand this one may come out about February 10th. Do you plan on having another one in 2014, or are you just planning one per calendar year?
MR. CANNON: I think there are several things that will drive how quickly we'll have these kind of announcements, the key one being demand. Another one would certainly be the number of highly qualified applicants we get that are available for selection. So it's all driven by those two things essentially.
MR. MCCORMICK: And, Felix, this is Mike again. As part of this process we can expect that the demand will be there for at least the next three fiscal years to do substantial hiring so that if, in fact, the applicant pool and the selection pool is not large enough after the February announcement, you can expect additional announcements will come out.
MR. ESQUIBEL: Okay. And when do you plan on starting the academy up again?
MR. MCCORMICK: This past Monday.
MR. ESQUIBEL: Oh, very good. All right. Thank you, good luck.
MR. MCCORMICK: Thanks, Felix.
THE OPERATOR: Next question is from Florida State College.
MR. FISCHER: Yeah, hi, this is Sam Fischer, I think one of the core things we've been driving at here is, again there was a preference in the past for CTI students because they had a separate hiring announcement, that's obviously been done away with. So my question is on this since the -- you're saying the automated process will only evaluate their biographical questionnaire, their AT-SAT score, and their veterans preference, and you've mentioned several times that CTI schools will provide students that are inspired. How will the biographical questionnaire reflect either aviation education, aviation experience, or their aviation inspiration?
MR. SCOTT: As I mentioned, there are -- this is John Scott. There are questions on the biographical questionnaire that touch on those very issues.
MR. FISCHER: Will that biographical questionnaire be made public?
MR. SCOTT: No, it's a test. Like any test it has to remain secure.
MR. FISCHER: All right. You mentioned the -- you're purging the current inventory, and that I understand you'll do these, you know, announcements, as needed. How long do you plan for the new inventory to remain valid?
MR. CANNON: Again, as both Mike McCormick and I in answer to the last question, it'll depend on the number of people we get into the applicant pool and inventory in February, and it will, of course, depend on the amount of demand. So we don't normally put a date stamp on, you know, an inventory on when we will end it or when we will add to it. It's all driven by the circumstances at hand.
MS. BOSTICK: This is Carrolyn Bostic, I'm sorry I'm joining this call late. I'm the head of HR for the FAA and I'm traveling. But I also wanted to add to Rickie's comment, earlier he said the process is iterating and it is. And so one of the drivers of the next general public announcement could be if we make some tweaks to the process and then we would have people go through the new process. And we will, as we make these changes to the process, we will ensure that we keep you informed as appropriate.
MR. FISCHER: Yeah. Okay. So I guess, again and it comes to the point I think a lot of us are asking, I understand that you feel that we provide students with a broad aviation education, etcetera. But if a student were to go to any aviation education -- or any aviation program, you know, it would appear they be just as likely to be hired under this process for that matter if they go to any education or aviation experience. Why would we continue to offer air traffic programs at these colleges?
MR. TEIXEIRA: Joseph Teixeira, really that is your decision.
MR. FISCHER: Well, I understand that, but if the students see no benefit. If I went to nursing school and I couldn't get hired as a nurse, why would I go to nursing school?
MR. TEIXEIRA: Well, that's the jump that we're not making, right? So if you go to an air traffic school and you can indeed apply for an air traffic job, so I don't see the connection, we're not preventing anyone from applying. Offering a diversity of academia or curricula is what universities do. If you decided that you no longer want to offer nursing or engineering that doesn't mean that those jobs won't be there, just that you won't be offering that opportunity. The ATC jobs --
MR. FISCHER: All right.
MR. TEIXEIRA: -- will absolutely be there, and as Mike explained we see that that demand is going to be high in the next two to three years because we have a lot of retiring.

MR. FISHER:

Certainly, certainly.

I guess without seeing

17

the questionnaire it's impossible to quantify

18

the benefit that a student would get from an

19

aviation program.

20

MR. TEIXEIRA:

Is that fair to say?

MR. TEIXEIRA: Again, people go to these schools for more than passing a test or getting a job as you -- as has been mentioned before, that guarantee was never there. We hope that that education that you're providing will absolutely be helpful for them in the process of hiring and in the process of training in their career. This is not a one-test investment that they're making, at least I'm hoping that that's the case.
MR. FISCHER: Well, certainly. But again, you're correct there is no guarantee for hire. There was a guarantee for application because there were CTI announcements. The students could if they passed their program apply to a job posting that in the past was not available to general public. So while there's no guarantee of expect -- or expectation of hiring, there was a guarantee of application. At this point, you seem to be saying that anybody can apply, and without know whether it's weighting -- or what the weighting might be, air traffic experience is just as good as aviation administration experience, or gate agent experience, or pilot experience. So I can't see a benefit to having an air traffic program because without knowing the weighting, the students are just as likely to get any degree and be hired, so again what -- you know, that's my point.
MR. TEIXEIRA: I think we are in agreement that everyone is able to apply now, and we have been denying that opportunity to a lot of people in the past few years. And it's, as Mike explained, we must offer these positions to the general public in addition to the CTI schools.
MR. MCCORMICK: So, Sam, this is Mike again. I just want to
reinforce that in the past when we had general public announcements, along with CTI announcements, the applicants pool from that into our selection panel that we picked from, both of those, so there wasn't any preference given to we need to collect all the CTI students first before we can do general public hires. They actually got put together into a pool of candidates that are central selection panels went from. So I think there's a perception that because there was the CTI application pool that's different from the general public application pool, that there was an advantage to the CTI applicant, but, in fact, when we had the multiple announcements out there is not. The advantage has been for the past couple years, however, because we have not had a general public hire now since, but the CTI applicant pool was the principal pool that we were selecting from. That is the sole advantage and that is the advantage that is going away. All other advantages still remain.
MR. FISCHER: All right. One final question, on your web page it still lists Air Traffic Collegiate Training Initiative Program. What qualifications are required to be included in that?
MR. CANNON: The basic qualifications for the announcement, again in February, will be the three years of progressive responsible experience that demonstrates the potential for learning and performing air traffic control work. That'll be the basic qualification upon which those individuals will be evaluated, along with then them having to go through the biographical data.
MR. FISCHER: No. My point is that we are a CTI school. Other schools are not CTI schools. If there is no requirement for us to provide student lists or recommendations for hire, what constitutes being in the CTI program? In other words, what would allow a college to say we are a part of the CTI program if we're not giving you recommended students, if we're not providing lists of our enrollees, what makes us part of the CTI program?
MR. TEIXEIRA: Sam, a lot of these lists were simply to keep track of administrative controls such as AT-SAT scores and all that, again no advantage was granted there. We had an initial qualification and application for CTI schools, we haven't taken anybody since. There are many, many schools on the list wanting to be a CTI school. We're evaluating that process and how we might take in additional CTI schools. So that's really where we are now. You know what the requirements were at the time of application to become a CTI school. It is a designation, it is used by schools, it is used in their marketing, and it used in part of their curriculum, so all of that will remain in place.
MR. FISCHER: All right. Thanks for hearing the questions.
MR. TEIXEIRA: Thank you.
THE OPERATOR: Sharon LaRue.
MS. LARUE: Hi, this is Sharon LaRue from Alaska, Anchorage, and one of our questions is, did you intend to exclude students with two-year degrees? That doesn't seem to meet your criteria right now, and a lot of these students do have the two-year degree. And if that was the intent, I don't know -- we're wondering how that is supposed to increase your diversity when community colleges and two-year degrees are generally a more diverse population?
MR. CANNON: No. This is Rickie Cannon. There is no intent to not include any school or degree. Basically, if someone has a two-year degree and experience, we can combine those to meet the three years of experience that is required. So it's not necessarily that the two-year associates would be of any less value than a four-year bachelor's degree. We are simply looking for a total of three years, and we can combine both experience and education.
MS. LARUE: Okay. Well, I don't think that's probably very clear to people on the outside looking in and they might not feel that they could apply based on that. The other thing I have here is I found the questions from Scott that he wasn't able to access. So can I ask is there ever going to be another CTI posting?
MR. CANNON: Well, that's a certainly a question that I don't think anyone can answer now. What I can say is the Agency is moving forward with the process, a single nationwide announcement, and until a different decision is made, if it ever is, then that will be the way forward with how we will announce these jobs in the future.
MS. LARUE: Okay. Why would the pool of candidates from the current CTI and BRA candidates that -- why has that been disregarded, why is -- why can they not -- I know they can under -- I understand they can apply again, I understand that, but why are we not just rolling them in too, and assuming that they already have applied, since they obviously have?
MR. CANNON: Are you referring to those individuals who are in the inventory without a tentative offer letter?
MS. LARUE: Correct, the 3,000 to 3,500 that we've heard that are awaiting jobs and have been waiting for quite a while. Why are we not just rolling them over into, assuming that they have already applied?
MR. CANNON: Well, they weren't necessarily awaiting jobs, they were awaiting at some point to be recovered.
MS. LARUE: Correct. Yes, they were awaiting some type of tentative job offer. I understand they hadn't had anything yet. But, obviously, they all applied for the job once, why are they being made to reapply?
MR. CANNON: Well, one, it is a different process that they are having to go through than what the legacy process was.
MS. LARUE: Okay. So they all do need to reapply though, we can tell them that. And then another question from the CTI list is how will the automated --
MR. CANNON: Those individuals who get a letter from us to say that their resumes are being closed out in the current inventory will be told if they want -- if they still have interest, they can reapply in February, so they'll get a letter from us.
MS. LARUE: They will, everyone on the list will get that letter.
MR. CANNON: Okay. Those individuals whose resumes are being closed out. Yes, ma'am.
MS. LARUE: Okay. Would applicants that failed to be recommended for employment by a CTI school, or an applicant that has already taken and not passed the AT-SAT, are they eligible for employment under the new system, so somebody who went through our program and did pass, or somebody who did pass, but received less than a 70, are they eligible for employment under the new system?
MR. CANNON: Yes, any U.S. citizen will be able apply for employment and they will -- you know, if they are subsequently selected, then they will go through the normal course, or process of suitability, security, and all those things that everyone goes through to determine their fitness to actually be employed.
MS. LARUE: Okay. Obviously, this change was months in the making. Why were the CTI schools not brought in at least and briefed on the changes until such a late date when there's really not a lot of time for us to react, and a lot of time to brief our students especially over the holidays? What was the reason behind not including the schools in at least what was going on if -- as -- not as part of the decision making process, they should have been informed. That is part of the agreement we had with you is that you are supposed to keep us informed, and that has not been done in this case, so we're wondering what the reason for that was?
MR. TEIXEIRA: Okay. I can accept that you don't think that this notification is timely. We'll certainly accept that (indiscernible). I can tell you that we are holding no such teleconference with anyone else. We're not sending letters to anyone else. This is a courtesy advanced notification of the policy changes that the FAA has made. Everyone else will just have to respond the vacancy announcement. So I think we are indeed taking the time to try to explain to you, give you the background, explain why we're here, and try to work through any of your concerns to the extent that we can help explain the policy changes so. And the reason we didn't do it before then is because we didn't have the policies decided on. As soon as they were decided on we're having this teleconference.
MS. LARUE: But we got announcements way back at the beginning of December that this was already decided on because we -- there were announcements coming out that this was already going. And we've all been responding to students who are admittedly very upset about this process, and my response has been I don't know until now. We do have more information now, and I thank you for taking the time to do that. But I can't honestly -- I mean I can't under -- I find it hard to understand how you guys would make that major policy change with only six weeks notice.
MR. TEIXEIRA: Look, what I'm saying is that clearly we've been working on it. Many people have been working on it for months. In fact, I quoted you that we've been working on it back to 2011. What we didn't have is a corporate decision and a process that we could withstand your questions. So any rumors that you may have heard were just that because the people who made these decisions are in this room and on this teleconference. And I assure you that even yesterday we were working through some policy questions on just exactly what we're going to do with this group, and that group and others. So as Carrolyn Bostick said earlier this isn't -- it is a process that changes are coming, and we're moderating it, and it's on the basis of how quickly we can make them, how quickly can the Agency adapt to it, how quickly can the IT department adapt to it. So you are getting it as early as we are coherent enough to explain it to you.
MS. LARUE: Okay. Another one from the list is, without geographical preference, how will the applicants be placed at facilities? This is something we're going to need to tell our students if they do accept a job. Are they going to go there? At what point will they know where they're going? What if they accept a job and it's just a place they cannot move to, and once they get there to the academy and they say your job is here, and they can't go there, what are their options going to be? Those are going to be questions we all have from students.
MR. TEIXEIRA: Yeah, but before we answer that, I think Carrolyn Bostick has been trying to get into this conversation. Did you want to say something Carrolyn?
MS. BOSTICK: Yeah, I do and I wanted to say that your comments are spot on. And I wanted to also say to the CTI schools that we are extremely respectful of your concerns about the changes to the process. We are trying to make sure that at this point we're very transparent, and we need to keep talking. There are some things that you've asked us that we still need to think about, but we get it, and we're trying to -- this has been a very complex, difficult process issue to tackle. If we could have communicated it earlier we would have. Every day we're working on this because we really are trying to get it right. But please remember -- please keep in mind that we get how complex and how difficult a change this is for you and your constituents, we get it.
MR. TEIXEIRA: Regarding your question, Sherry, (sic) so a lot of the assignment processes are still being worked out, but here's what I do know for sure now. Individuals will be placed on the basis of the specialty that they graduate in, and on the basis of where the vacancies exist at the time. So when folks apply and they're offered a job, they're going to be offered new conditions, meaning that they have to agree to be -- to accept a position anywhere in the United States at the direction of the FAA. If they are not prepared to do that then they won't be a successful applicant. But if they are --
MS. LARUE: They have to make that decision before they're assigned to attend the academy or not, or are you waiting until they're done with the academy to offer that --
MR. TEIXEIRA: They will have --
MS. LARUE: -- because they'll want to know, you know, when is that decision going to be made. If they do not want to accept that, are they not going to be allowed to attend the academy?
MR. TEIXEIRA: Yeah. That is correct, they have to accept that condition as a condition of their employment, that they -- if they can -- they can be -- the offer is contingent on them accepting a placement anywhere in the United States. If they're not prepared to accept that condition, then they'll have to decline the job.
MS. LARUE: Well, what if they get done and they won't accept it, and they are not allowed to quit then? Are they signing something that says they can't?
MR. TEIXEIRA: Okay. Now, people can always quit any job, I mean whether it's through the FAA or any other company. But I'm saying is we are going to be honest and transparent like Carrolyn said. Ahead of time you're going to tell folks you're being picked for this job, you're going to spend this much time at the academy, based on your graduation scores and abilities you will be offered a job anywhere in the United States. They will be at the -- where the FAA has its highest need.
MR. CANNON: And the -- and then also what Joseph is saying, the vacancy announcement itself will provide applicants some information with regard to that. Again, it's a nationwide announcement, and I believe there will be verbiage in there that says you may be placed ultimately in any location in the United States that the FAA may have a mission need (indiscernible). So there'll be some information, so as even if they decide to apply they will be making an informed decision that they may ultimately be placed in a location that may not be their preferred location.
MS. LARUE: Okay. And you said they would be placed on the basis of the specialty they graduate in. So are they going to be offered an academy spot in route, or an academy spot in terminal, or is that decision going to be made at the academy as well?
MR. TEIXEIRA: Well, for the February lists that offer will come before they go to the academy.
MS. LARUE: Okay. Okay. Anything else?
MR. MCCORMICK: So, Sharon, this is Mike McCormick again. We've been working very closely with NATCA the labor union who represents the air traffic controllers, and certainly they understand and appreciate the concern of facility placement post academy graduation. And we've all have had to work in the past knowing that folks are getting assigned to where they don't want to be. So we're trying very hard to come up with a process that will, number one, accommodate the need of the Agency in terms of the priority of placement, at the same time be sensitive to our employees. There's no guarantees they're going to get where they want to go. But our needs are great, and our needs are varied. There's going to be a lot of flexibility initially in this process.
UNID MALE: Will there be a preference for terminal or in route?
MR. MCCORMICK: They have no preference at this time for either one, we have great need involved.
UNID MALE: So the applicant will no opportunity to give a preference?
MR. MCCORMICK: No, not in this February iteration.
MS. LARUE: Okay. Thank you.
MR. MCCORMICK: Thanks, Sharon.
THE OPERATOR: Next question is from Melissa Denardo (phonetic).
MS. DENARDO: A lot of the questions -- can you hear me?
MR. MCCORMICK: Hi, Melissa.
MS. DENARDO: Hi. Okay. A lot of the questions have been answered, you know, previous. I know that when you put the narrative out the other day it stated that the students needed a four-year degree. Now, I'm hearing, correct me if I'm wrong, that they could get a two-year degree and one year of experience as long as it equals three years; is that correct?
MR. CANNON: Yes, I'm looking at, I believe from the CTI letter, and I think what it says is applicants must have at least three years of progressive responsible work experience, a four year degree, or a combination of the two. (Indiscernible) to that or a combination of the two. We can always combine education and experience, so if they have a two year degree and one year of work experience, that equals three. We're just trying to get to three or a four-year degree would equal three. That's the old (indiscernible) qualification standard.
UNID MALE: They don't actually need a degree?
MR. CANNON: Right, and they don't need a degree.
UNID MALE: Right, they can have --
MR. CANNON: There is no degree required.
UNID MALE: They can have two years of college and one year of professional work --
MR. CANNON: Exactly.
UNID MALE: -- and still meet the criteria.
MR. CANNON: Exactly, that's my point, we can combine the two.
MS. DENARDO: So what I understand then is without seeing the criteria, I don't know if that education -- could I be in any field? So I could be in elementary education, and I could have a two-year or a four-year degree in elementary education and I would receive --
MR. CANNON: Exactly.
MS. DENARDO: -- preference and I could apply for this job?
MR. CANNON: There is no positive education requirement, ma'am, for this occupational series, period. This is simply, if there is education, we can use that to substitute it for the experience requirement so, yes. To answer your question, yes, that education or that degree could be in anything.
MS. DENARDO: Have you built in any preference, any preferential point for the CTI schools that have been supplying you faithfully all these years?
MR. CANNON: No, there is no preference. No, ma'am.
MS. DENARDO: I want you to know that since you put the announcement out we have had students dis-enrolling so this has -- will have a great affect on the community college system that these students enrolled. I don't know whether you've taken that into consideration, but it will have a tremendous affect.
MR. TEIXEIRA: So I don't know what you mean by putting an announcement out. We really have not announced these changes to anyone other than to CTI schools, and you received that for the first time on the 30th of December. There's been no announcement.
MS. DENARDO: But our students are aware.
UNID FEMALE 2: Because they have controller (indiscernible) --
MR. ROMEO: But, hey, this is Corkey from CCBC just following up on Melissa's comment. Our students are aware. I've had parents who are controllers call me already and ask me what our plan forward is, and I was waiting for this phone call to figure that out. I think I have a halfway decent idea. I've got a couple questions though if you'll indulge me just a second. Would it be fair to say that the more education you have on your biographical survey is a more better idea, in pilot talk, more better idea?
MR. MCCORMICK: Hi, Corkey, and you're from Beaver County?
MR. ROMEO: Yes, sir.
MR. MCCORMICK: Okay. Thanks. I think I've answered some of that earlier in the biographical questions. That we definitely -- they have developed questions that are going -- that are designed to dig out --
MR. ROMEO: Right.
MR. MCCORMICK: -- any experience and education, and specific aviation relevant education that applicants would have. So without giving away those specific questions, I think you can feel comfortable that the CTI education will certainly be an advantage through that process, it's a singular advantage. In other words, you don't check a box and then suddenly hear (indiscernible) the list.
MR. ROMEO: Right.
MR. MCCORMICK: You are competing against the pool.
MR. CANNON: Right. Yeah, you are competing against the pool. And I just want to go back to what Joseph said in his opening. Those referral lists will be necessarily by veteran's preference because veteran's preference is a legal entitlement. So as we issue those referral lists those referral lists will be by veteran's preference. Michael makes a good a point from what Dr. Scott has talked about with regard to how the biographical data is laid out.
MR. ROMEO: So -- okay. So put it in pilot talk because you -- most of you guys know I'm a pilot, so it's got to be simple. And that the more pointed, for lack of a better term, education you have -- you may have, I emphasize, may, have an advantage, and I hate to use that word, a better look at the biographical -- a better shot at the biographical survey/test?
MR. SCOTT: Well, Corkey, you have to look at it in totality, it's a combination of education --
MR. ROMEO: Right.
MR. SCOTT: -- and experience. So a person with a strong education and strong experience is going to have more of an advantage over someone who just has education.
MR. CANNON: That's right.
MR. SCOTT: Okay. Right. So --
MR. ROMEO: Okay.
MR. SCOTT: -- you have to -- it helps us to develop the complete picture of the applicant.
MR. ROMEO: Roger that, understood. Now, to follow on another question that was asked before on the -- when the candidates get to Oklahoma City and they're -- and you're getting ready to make assignments in either terminal or in route, depending on how the student does, so to speak. Are they dream sheet type thing, going back to military words, where they can, you know, say this is where I'd like to go, if you can't handle that then -- you know, then it's up in the air type stuff?
MR. MCCORMICK: Corkey, we're still working out the details on that, but --
MR. ROMEO: Okay.
MR. MCCORMICK: -- what we envision is probably going to be something like that.
MR. ROMEO: Okay.
MR. TEIXEIRA: Yeah, I mean we don't have any details, but we just -- as it was mentioned before these are our employees, we want to take care of them we're going to get them as close (indiscernible) --
MR. CANNON: Yeah, and I believe, Corkey, we haven't -- we don't have the final, final design of the announcement. But I believe, even though this is a nationwide announcement, the applicants will be asked if have a preference for a couple of two or three locations, but that's only just to give a signal to ATO that, you know, there is a signal there. But it certainly will not prevent placement anywhere the ATO believes to be --
MR. ROMEO: Yeah, the been there, done it, got the t-shirt type stuff.
MR. CANNON: Yes.
MR. ROMEO: Have -- do we have any idea of -- or do you guys have any idea, not me, on the success rate at -- that you believe your success rate is going to be at Oklahoma City going this route as opposed to the past old routes? In other words, are we going to -- do you think we're going to have a higher success rate going through Oklahoma City even like with -- I know it's -- you're guessing here, but I'm just curious.
MR. TEIXEIRA: So will the medicine be worse than the disease?
MR. ROMEO: Yeah, I guess.
MR. TEIXEIRA: We don't know. We don't know. So we have -- again, as I mentioned earlier, every time I went out to a CTI school the number one issue is, you guys take too long to get these people hired, to get to the academy. They've been out of school for a year or two. This is a problem. We have to retrain them. Our own facilities are complaining about it, so we have -- we are forced to send people to the wrong location because we made them a promise two, two and a half, three years ago even though that we didn't have the predicted retirement at that location, and now we have overages in one place and not enough people in another. So we have a lot of challenges that need to get addressed. Getting some flexibility to do initial assignments will help us. Now, clearly, we are going to lose some people who are not willing to move away from their hometown. We are hoping that that will be a small number.
And that's the nature -- yeah, that's the nature

24

of the job, and I think most of us who have been

25

in it understand that, so that -- I'm just

26

trying to get -- because I've got to start

27

briefing students on Monday and Tuesday because

28

we're starting new semesters, and so I'm trying

VTranz
www.avtranz.com (800) 257-0885

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 71 of 342 Page ID


#:266
1

to formulate that briefing, you know, as we're speaking here so.
MR. SCOTT: But we do predict that this process will increase the success rate.
MR. MCCORMICK: All right. And one thing I want to reinforce what Dr. Scott just said; we're not looking at the success rate at -- through the academy. This process that Joseph initiated back in 2011 is going to improve our success rate of our employees through the facility, reaching the CPC level and then beyond.
MR. ROMEO: Yeah, and in reality, I apologize because that's what I meant.
MR. MCCORMICK: All right. Okay. Thanks, Corkey.
MR. ROMEO: Yeah, that's what I meant. All right. Thanks.
THE OPERATOR: Next questioner's name was not recorded, if you prompted for a question, your line is open, please check your mute button.
UNID MALE: Maybe --
MR. LATHAM: Hello, this is Verne Latham, Arizona State University. I had a quick question, kind of a two-part thing here. Referring back to the first gentleman, Dr. Pearson, referring to the CTI barrier analysis, the CTI not being identified as the barrier. What the report indicated was that students coming out of the CTI program, whether a part of the protected groups or not, there was no

barrier identified for anyone coming out of the CTI schools. So what it indicates was no matter who went through a CTI program, there was not a barrier identified for those students coming out of the program. So the question I got then is here we -- the FAA has had the CTI program all the way back to the early 90s, been very successful, otherwise, the FAA wouldn't have continued it, they wouldn't have expanded upon it. So my question is, why'd it take -- if this has been a problem recently, why'd it take so long for the FAA and the federal government to identify a barrier being put up to people applying for these jobs? Because it seemed to me the CTI program has been very, very successful as indicated by the length of the program within the -- by the FAA. And then on top of that, the barrier analysis is basically stating that anyone coming out of the CTI program never faced a barrier.
UNID MALE: The -- if I could just jump in, John. The barrier analysis that was done did not look at applicant sources per se as barriers, it didn't review it that way.
MR. LATHAM: Excuse me for interrupting. Excuse me for interrupting. But they did have a statement in

there saying that the students coming out of the CTI program, they were not -- there was no barrier of the seven barriers that were looked at. The CTI students coming out through there graduating did not face any barrier -- any of the barriers, any of those seven steps. That was in -- yeah, that was in the report.
UNID MALE: Right, and the barriers -- when we talk about barriers, we talk about barriers to diverse candidates getting through those stages.
MR. LATHAM: Right, and that's what they were saying, that it didn't matter that any of the protected groups, anyone, any of the minority groups, no one coming out of the CTI program, there were no barriers identified for anyone coming out of the CTI programs. So there's no -- the people coming out of the CTI programs were basically successful all the way through that. Now, the barrier report did also -- also indicated that that did not reflect their eventual success through full certification at a facility. But it did indicate that someone coming out of a CTI program is very, very well prepared to make it through all those seven barriers.
UNID MALE: Yes, and that's the point, they were able to get through the barriers quite well, but -- or

through the selection points quite well, but there was not the data to show the ultimate success through academy and onto CPC. And so the barrier analysis looked not just at CTI as an applicant source, it looked at a number of different applicant sources. So the CTI is simply one of many applicant sources that this barrier analysis addressed. So the recommendations that are coming out of it have to look more broadly at all of -- at dealing with the issues associated with all of the -- all of the applicant sources across these decision points. So I think CTI -- the applicant sources are not barriers, it's the decision points that are thought of as barriers for particular diverse groups such as African-Americans, Hispanics, and so on as they move through that. So I think that is the (indiscernible) --
MR. LATHAM: All I'm saying though is the CTI program has been around for about 20 years. Originally, I think it was seven schools, and it was eventually expanded in 2007. So the question I have is based on the fact that it was such a successful program, but all of a sudden now we're being told that population coming out of CTIs is not diverse enough. Why wasn't this identified in the 90s or in

the mid parts of 2000, and why wasn't this type of barrier analysis done at an earlier point? It makes me kind of suspicious as why all of a sudden this is a hot issue in 2011, why wasn't it identified a lot sooner than that? And especially with the CTIs people coming out they don't face the barriers.
MR. TEIXEIRA: Okay. So we are mixing apples and oranges. Okay. But because --
MR. LATHAM: Not really. Not really, because we --
MR. TEIXEIRA: Let me speak and then I'll tell you why.
MR. LATHAM: Go ahead.
MR. TEIXEIRA: The fact that CTI students were able to successfully navigate all the application processes within the FAA and offer -- be offered jobs that is what the barriers are about. So, yes, CTI school students were able to successfully apply and be selected through the old process. No barriers were identified for them.
MR. LATHAM: No, I thought you said earlier that there wasn't. That wasn't in the report. I thought you said it wasn't in the report and --
MR. TEIXEIRA: No.
MR. LATHAM: -- now you're saying it is.
MR. TEIXEIRA: Okay. You're being argumentative. So what I'm saying is we didn't look at whether there were any barriers within the CTI schools.

MR. LATHAM: No. What I'm saying though is it was identified that of the seven barriers that this barrier analysis looked at, those seven barriers were not identified as being a barrier for any of the students coming out of the CTI program. So in other words, the indication was in that report are the seven barriers that this study analyzed that there is no barrier -- none of those seven barriers prevented a CTI program from getting all the way through.
MR. TEIXEIRA: So clearly everyone -- there was no indication that the FAA process offered any barriers to CTI students, we capitulate on that. Now, also what we -- what that study indicates is that FAA's almost exclusive use of the CTI lists in the past couple years provided the FAA with a pool that wasn't diverse enough. But, again none of these changes are being made solely as a result of that. I mean I've been trying to for the past hour-and-a-half to say we are engaged in a robust change process that involved many changes and we were informed in that process by several large reports. The barrier analysis is just one of them, right. And we can differ on what it says, but we used it. We accepted that report as is and we value it, and it does say perhaps different things to

different people. But we are taking information from it and using it.
MR. LATHAM: Well, I'd like to make one other point and I'll get off and let other people ask a question. But, you know, this had been mentioned by various people at different schools is, now Terry Crab (phonetic) over the last year, year-and-a-half told us on numerous occasions, either in telcoms, or at the CTI conferences, that the FAA was very satisfied with the diversity of the student pool in the CTI schools overall, but there was -- the FAA had no problem with the diversity pool that existed within the CTI program. Now, we're being told that it's not diverse enough. So, you know, I feel maybe Terry didn't have all the information, or the FAA, you know, basically, was throwing this out there for us. But, you know, it's kind of disingenuous to be saying that to us and all of a sudden have this come out now, like you just said right here, which is also in the barrier analysis report, that the diversity in the CTI programs is not what the FAA would like it to be so --
MR. TEIXEIRA: No.
MR. LATHAM: -- the question --
MR. TEIXEIRA: I am not saying that, so a couple of --
MR. LATHAM: I'm just saying that's what Terry -- that's what

we were told by the FAA, that the diversity of the schools, the FAA was very pleased with.
MR. TEIXEIRA: Okay. So the numbers are the numbers, you have them. But before we go through that, I want to simply say look, the CTIs have done a magnificent job in doing outreach to communities, having outreach to high schools, creating incentives for people to -- in the minority work groups locally to participate in these programs, and to offering grants and scholarships, so that program has been terrific, right? The numbers of minorities graduating from CTI schools, I mean those numbers are available in the report. We're transparent on what the numbers are, you can see them, you can read them, you can reach whatever decisions are. Well, also what I'm saying is that as a totality of all the people that we have hired from all the sources we have a deficient pool, and we need to do things in all the processes that we're engaging to improve, we need to improve our recruitment of minority students and women as a commitment that we have and an obligation as a government organization. And we are using the recommendations in that barrier analysis to make those improvements among many other improvements we're making.

MR. LATHAM: Oh, I'm fine. If you guys want to leave it at that way, that's fine. I have no further questions. Thank you.
MR. TEIXEIRA: I am, and I would like to announce that we probably need to take two last questions.
THE OPERATOR: Thank you, and the next question is from Wayne Ressitar.
MR. RESSITAR: Good afternoon, guys, thanks for taking the question.
MR. TEIXEIRA: Hi, Wayne.
MR. RESSITAR: So we just spoke a little bit ago about there --
THE OPERATOR: Wayne Ressitar, your line is open.
MR. RESSITAR: Okay. Can you guys hear me? Can you hear me? Can you hear me?
UNID MALE: We're going to our last --
MR. RESSITAR: Can you hear me?
UNID MALE: -- caller then, please.
UNID MALE 2: Push the star --
THE OPERATOR: Thank you, and the last question is from Julie Moore (phonetic); your line is open.
UNID MALE: Hi, Julie.
MR. RESSITAR: Nobody hear me?
UNID MALE: Hear you now.
MR. RESSITAR: Oh, this Wayne again, I got cut off there. Can I ask it real quick?
UNID MALE: Absolutely.
MR. RESSITAR: Okay. Hello. So we just talked about a minute ago about this list that the FAA picked from, and

now it wasn't diverse enough so now we know there was a list. So at CCBC, here our enrollment basically is about 60 percent plus out of state, so they come here to get their ATC degree and hopefully make it on this list. So now there's no more list and the way we see it there goes 60 percent of our enrollment. They might as well stay home in Des Moines, Iowa, go down the street to a community college, take a couple courses, maybe a couple weather courses, ATC course, have a couple years progressive work experience, and apply -- and they're going to be considered equally as though you came to a school like this; am I correct?
MR. TEIXEIRA: Wayne, I need to be given an opportunity to correct what you said.
MR. RESSITAR: Okay.
MR. TEIXEIRA: There was no list. The barrier analysis was a retroactive look at the people we hired, right? So it's not looking forward. It's not looking at the pool. It's not looking at any list. It's looking at what was the pool of individuals that we hired in the past four or five years.
MR. RESSITAR: Okay. So no lists. Well, I meant to say that the recommendation list that came to the FAA from the CTI schools, that recommendation list that

we thought we had before, and now that's going to be gone. That's the list I was referring to.
MR. TEIXEIRA: Correct.
MR. CANNON: Yeah.
MR. RESSITAR: So --
MR. CANNON: Yeah. You are correct, now those individuals, you know, whether they are attending a CTI school, or not attending a CTI school --
MR. RESSITAR: Right.
MR. CANNON: -- will be free to apply if they have interest in that particular occupation.
MR. RESSITAR: Right. Yes, sir, I understand that. So what I'm saying is students came from out of state to our school to get their ATC degree and hopefully get on this list that no longer is going to exist. So why will students consider coming here? There is no advantage of coming here to hopefully get on a list that's no longer going to exist. You might as well stay home and go to school back there and apply to the FAA back there someplace because there's no advantage of coming here anymore.
MR. TEIXEIRA: Okay. Wayne, I think we -- we've answered this question several times during this call, and we said many times that just because you were on a list, and we knew that you graduated from an AT school, that didn't really give you any advantages.

I mean we can continue to think that is did, but it didn't.
MR. RESSITAR: It did.
MR. SCOTT: Okay. And if I could add a question here, this is Jim Scott, also from Community College of Beaver County. A lot of this I understand is driven by diversity, and I'm just kind of wondering how you think this is going to be better than the CTI schools are doing. How are you going to achieve diversity when everybody enters on a level field?
MR. TEIXEIRA: Okay. So we have not said that any of this stuff is driven by diversity. And it's also clear that the process that we're putting in place is gender and race neutral. And we have no guarantees that there will be, when we do another barrier analysis two years from now, that we will have improved our outcome, although it is our hope that it would. But that the -- those issues are not driving these changes, and we have not designed any system to target the acquisition of minorities. We're designing an open and neutral system for everybody, and we're going to put everybody through the same way.
MR. SCOTT: Well, I was under the --
MS. BOSTICK: This is not the only process -- this is Carrolyn Bostick. This is not the only process that the

FAA will look at. I mean what good organizations do is from time to time they look at their processes, they review them, they make -- they improve them. And there's nothing suspicious about this. This is one of our processes that we're looking at, and we will continue to look at it, and we will also look at other things. But this is not about trying to -- the goal here is not achieving diversity. The goal here is a fair and consistent process that is perceived as fair and by all people who are trying to partake in it. And if we, through this process achieve greater diversity, well isn't that awesome. But the goal is not specifically about achieving diversity. It's very important that everyone understands that.
MR. JONES: And with that, Operator, I think that was our final question. If I could just wrap this up and thank everybody for taking the time to join us in this telcon. We know that this is extremely impactful to you and you have a lot of concerns both as organizations, as aviation professionals and concerns for your individual students. So we know that there's going to be a lot of work yet ahead of us in continuing to foster

and build our relationship with the CTI academic institutions, we are committed to do that. We know that this is impactful to your business education models that you currently are using and have been using for several years. But we also see this as an opportunity that we can work with you to help develop your future in the role in this. So with that, Operator, we'll go ahead and sign out.
UNID MALE: Becka, if you could pull our line from the conference for a final line count, please?
THE OPERATOR: Yes. Thank you. And thank you all for your participation. One moment while I transfer speaker. You may disconnect at this time.
UNID MALE: Bullshit.
(Recording Ends)

CERTIFICATE

I certify that the foregoing is an accurate transcript, produced to the best of my ability, from the electronic sound recording provided by the ordering party.

Dated: January 23, 2014

______________________________
DEBRA E. SHEA
AVTranz
845 North 3rd Avenue
Phoenix, AZ 85003

EXHIBIT 4


From: [email protected] <[email protected]>


Sent: Monday, December 30, 2013 4:18 PM
To: [email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected]; Fischer, Sam;
Cheatum, Carter L.; Gallion Jr, Donald K.; Cabral-Maly , Margarita A.; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected]; [email protected];
[email protected]
Subject: Hiring of Air Traffic Controllers by the FAA
Dear Colleagues,
The quoted text below, is an extract from letters we sent today to all our primary Collegiate Training Initiative (CTI) contacts. This email is being sent to an expanded list of CTI stakeholders to initiate a dialogue on upcoming changes to the hiring process for air traffic controllers by the FAA.

"The Federal Aviation Administration (FAA) has enjoyed a long-standing


relationship with your organization and values our partnership in the
training of potential Air Traffic Controllers (ATC). Recently, the FAA
completed a barrier analysis of the ATC occupation pursuant to the Equal
Employment Opportunity Commission's (EEOC) Management Directive 715. As a
result of the analysis, recommendations w ere identified that we are
implementing to improve and streamline the selection of ATC candidates.
These improvements will have a direct and present impact on all hiring
sources, including CTI.
An overview of the immediate changes being made to the ATC hiring process
is presented below.
Revisions to ATC Hiring Process

A nationwide competitive FG-01 vacancy announcement open to all U.S. Citizens will be issued in February 2014. Any individual desiring consideration for employment (including CTI graduates) MUST apply. Existing inventories of past applicants will not be used.

All applicants will be evaluated against the same set of qualification standards. Specifically, applicants must have at least 3 years of progressively responsible work experience, a 4 year degree, or a combination of the two.

The existing testing process has been updated. The revised testing process is comprised of a biographical questionnaire (completed as part of the application process) and the cognitive portion of the AT-SAT. The cognitive portion of the AT-SAT will be administered only to those who meet the qualification standards and pass the biographical questionnaire. Applicants for the February 2014 announcement will be required to take and pass the new assessments in order to be referred on for a selection decision.

Since a single vacancy announcement will be used for all applicant sources, a single nationwide referral list will be generated containing all candidates who meet the qualification standards and pass the assessments. Location preference will no longer be used as a determining factor for referral or selection.

Centralized selection panels will no longer be convened to make selections from the referral list. Selections will now be fully automated, grouping candidates by assessment scores and veteran's preference.

These improvements to the ATC hiring process will significantly strengthen the long term sustainability of our program and offer our candidates a fair and viable opportunity to demonstrate their capabilities and potential for the ATC position.
We recognize that you may have questions concerning these changes.
Considering the upcoming holiday season, we are planning a teleconference
for Mid-January when we will more fully address questions and concerns you
may have.
We want to reiterate that we very much value our partnership with the CTI
program and look forward to assisting you in understanding our changes to
the ATC selection process. We will be contacting you soon to schedule the
January teleconference."

Best Regards, Joseph


Joseph Teixeira
Vice President for Safety &
Technical Training
Air Traffic Organization
Tel: 202-267-3341
Email: [email protected]
(Embedded image moved to file: pic26439.gif)


EXHIBIT 5

[Illegible] representative. The Biographical Questionnaire is a [...]-item inventory that was developed based on items from Owens' Biographical Questionnaire (Owens & Schoenfeldt). The items tap eight areas:
educational background
prior military or civilian experience in [...]
importance placed on various factors (e.g., salary, benefits, job security)
time expected to become an effective [...]
commitment to an [...] career
work-related attitudes
expected satisfaction with aspects of careers, and
general personal information (e.g., socioeconomic status, growing up, alcohol and tobacco usage) (Collins et al.)

EXHIBIT 6


Federal Aviation Administration

ATC Hiring Stakeholder Briefing on Hiring Process

Congressional Briefing Coordination
FAA
Date: 16 January, 2015


Background

Interim Changes to the ATCS hiring process went into effect with the February 10, 2014 announcement. Key changes included:

1 vacancy announcement => streamlined announcement process
1 set of qualifications/eligibility => simplified min qual/eligibility review
Multi-hurdle selection process => increased validity & efficiency of process
Eliminated CSP & Interview => shortened process and removed potential subjectivity
Resulted in 1,591 selections

Job Task Analysis (JTA) of ATCS occupation
Biographical Assessment revision
Current process relied on results from interim process, Job Task Analysis and assessment of alternative courses of action
Dialogue and updates with CTI Schools, CTI School Associations, FAA Employee Associations, and NATCA


Biographical Assessment (BA)

Newly developed BA driven by recent job analysis and subject matter expert input
Empirically validated against key performance indicators (e.g., Maintaining Safe & Efficient Air Traffic Flow, Maintaining Attention & Vigilance, Prioritizing, Technical Knowledge, Teamwork)
Alternate forms created to enhance security and validity
BA items randomized


Hiring Process Summary

Two-track approach consisting of a:

General Experience/Education Track
External (All U.S. Citizens) general public vacancy announcement (FG1)
Focus is on reaching candidates without operational ATCS experience, but who have education and/or other general work experience

Specialized ATC Experience Track
External (All U.S. Citizens) announcement (AT-AG)
Focus is on reaching ATCS candidates with operational ATCS experience (e.g., reinstatement / former CPCs, former military with ATCS experience)

Retain authority for non-competitive appointments outside of standard two-track approach for critically staffed facilities
To be used on a very limited basis


General Experience/Education Only Track

March 2015 announcement; Temporary, 2152 FG-1
ATO will determine the number of hires needed for 18-24 months
Announcement open period will be sufficient to reach total number of applications required to meet projected hiring needs
Candidates MUST pass qualifications tests
Revised Biographical Assessment; Increased security
Alternate version of the AT-SAT
All candidates who pass qualifications tests provided tentative offer letter
All qualified candidates who then meet all pre-employment requirements will receive a firm offer letter
New hires attend Academy for initial training, facility offered upon graduation
Service area preference considered but final facility assignment and option based on Agency need
Includes consideration of those candidates impacted by age-related provision in the FY15 Consolidated Appropriations Act


Specialized Experienced Track

52 weeks of post-certification ATC experience
Permanent, 2152 AT-AG (or higher pay level, if CBA provides)
Specialized ATC experience requirement eliminates need for qualifications tests
Hired for/go directly to a specific facility (training at Academy as required)
Service area preference considered but final facility assignment and option based on Agency need
January 2015 announcement with reference to the March 2015 announcement for General Experience/Education Track candidates
Includes consideration of those candidates impacted by age-related provision in the FY15 Consolidated Appropriations Act


Process Overview

Decision point: 52 Weeks Certified ATC Experience?

Specialized ATC Experience (Yes): January Announcement (includes requirement for 52 weeks of post-certification ATC experience); Min Quals; TOL; Medical/Security/Suitability; FOL; Facility Placement (2nd Qtr 2015)

General Experience/Education (No): March Announcement; Min Quals; Biographical Assessment; AT-SAT; TOL; Medical/Security/Suitability; FOL; 1st Academy Class (4th Qtr 2015)

Option for non-competitive appointments for critically staffed facilities


Future Improvements for FY2017 Hiring


Introduction of new validated cognitive ability test to replace
AT-SAT
Projected first use for announcement in Fall 2016
Applies to General Experience/Education, FG-01 track only


EXHIBIT 7


Subject: Fwd: All Sources Announcement February 2014

...

From:
Date:
To:
Subject: [...] February 2014

From: <[email protected]>
Date: Jan 27, 2014 16:01
Subject: All Sources Announcement February 2014
To:
Cc:

[The body of the forwarded message is illegible in this copy.]


EXHIBIT 8


Google Groups

ATC Hiring update from the National President


James Swanson Jr
Posted in group: NBCFAEinfoWESTPAC

Jan 24, 2014 10:50 PM

NBCFAE Family,
Please read the information below carefully. This process is constantly evolving.

So here's what is fact so far.


1. The controller vacancy announcement is open to all US citizens that meet the minimum qualifications of the vacancy. Everyone interested should bid. The list will also be used to fill future vacancies.
2. Rumors have been spreading that (TOL) temporary offer letters may still be offered to people that went through the old hiring process. As you know this has been a big issue for NBCFAE. I confirmed yesterday with agency leadership including AHR-1 that the agency will not offer jobs to people that may have been in that pipeline. Their words were "That list has been purged". So please tell everyone impacted by this to apply on the upcoming bid.
3. CTI schools. During the holidays CTI schools were informed that they will no longer receive the preferences they have been receiving. I received a lot of feedback from people impacted the change. Some of it was very negative and some centered around not understanding why NBCFAE fought the issue the way it did. That conversation will continue but the bottom line is All CTI students need to apply on the upcoming bid.
4. Veterans Preference remains. There will be more coming on this one.
5. There is an effort to hire people with targeted disabilities to work in the ATO. The ATO has set a
2014 goal of hiring 10 people with targeted disabilities. The key word is targeted which are defined.
More to come.

What we do not know


We do not know exactly what the new ATSAT test will look like. We have a general idea based on the
skill set needed to perform the job. The test will have two components. A biographical test and a
cognitive test. You will have to pass the biographical portion to take the cognitive portion.
We do not know exactly how the selection factors will be applied but we do know that a diverse pool
must come from the process.
The FAA is still working both these issues and as soon as I know you will know. Please bear with me if
it takes time. I want to share the correct information the first time.
We will be offering online training for persons interested in learning the types of skill set assessments
we believe will be apart of the ATSAT test.

WHATS DIFFERENT
The hiring process for applicants will be a modified Pepsi which means there will be five locations where
a person can travel at their own expense to go through the hiring process before attending the
academy. The locations are Seattle, Dallas, Atlanta, Chicago and the fifth site is to be determined.

Applicants will be able to go through the security and human resources part of the process but will have
to handle their medical clearance separately. If a person chooses not to use the Pepsi process then
they will still have the option to use the standard hiring process which takes more time. More to come.

The Federal Aviation Administration (FAA) will be issuing a large number of vacancy announcements
on
February 10, 2014, for air-traffic control specialists on a nation-wide basis. It will only be open for 10
days.
http://www.faa.gov/jobs/career_fields/aviation_careers/

Visit the FAA Virtual Career Fair and learn all about select aviation
careers FAA is offering. FAA recruitment experts will be available for
live chats on Jan. 29, 12-4 p.m. EST, and Feb. 12, 12-4 p.m. EST.
To register for the Career Fair and to learn about these aviation careers,
please visit: http://vshow.on24.com/vshow/network/registration/5492
Applicants are highly encouraged to use the resume builder available on the
USAJOBS website usajobs.gov.
Visit the USAJOBS Resource Center at help.usajobs.gov/ to learn how to
build your resume, and access tips and tutorials on applying and
interviewing for federal jobs.

Let me say a few things in closing.


We have been very successful in spreading the word on this announcement and it is no
surprise, especially in these times, that the response has been enormous. I have seen so many diverse
and talented young folks looking for opportunities. Our challenge is to assist them in any way we can to
find opportunities wherever they can including in other areas of the federal government
if possible. More to come and we will need your help.
NBCFAE has a proud history of helping everyone who ask us for help. We do not ask the ethnicity of
anyone seeking our assistance and we want the best and brightest to work for the FAA. I believe what is
sometimes lost is that we are also the best and brightest. NBCFAE's goals include assisting in
recruiting African Americans, females, and minority individuals into the FAA and to promote equal
employment opportunities through all lawful means. We do not apologize for our commitment to that
end.
We encourage everyone to join us as members and hope that our organizations work speaks for itself
when people consider becoming a member
Please sign on the membership section of www.NBCFAE.org. Go to the Resources header. Use the
drop down menus to go to Documents and then to ATC hiring for additional information.
Please email any questions you have to me. I will compile them and send a Frequently Asked
Questions next week,
Have a wonderful weekend.

In Unity,
Roosevelt Lenard, Jr.

NBCFAE National President

https://groups.google.com/forum/print/msg/nbcfaeinfowestpac/7k6CeXLyoSI/gi4_MaHs3YgJ?ctz=3159682_88_88_104280_84_446940

3/3


EXHIBIT 9

Shown below is the website that was live on the faa.gov website from 8/10/2011 through 2/25/2014. This site provided details on the Air Traffic Collegiate Training Initiative (AT-CTI), stating that the program is designed to provide qualified applicants to fill developmental air traffic control specialist positions and that "The FAA hopes to employ all eligible CTI graduates." This site directed people to "Graduate from an FAA approved AT-CTI program" in order to become an air traffic controller and attracted thousands of students to attend approved CTI programs.


EXHIBIT 10


DOT/FAA/AM-01/5

Office of Aviation Medicine


Washington, D.C. 20591

Documentation of Validity
for the AT-SAT
Computerized Test Battery
Volume I.
R.A. Ramos
Human Resources Research Organization
Alexandria, VA 22314-1591
Michael C. Heil
Carol A. Manning
Civil Aeromedical Institute
Federal Aviation Administration
Oklahoma City, OK 73125

March 2001

Final Report

This document is available to the public


through the National Technical Information
Service, Springfield, Virginia 22161.

U.S. Department
of Transportation
Federal Aviation
Administration


N O T I C E
This document is disseminated under the sponsorship of
the U.S. Department of Transportation in the interest of
information exchange. The United States Government
assumes no liability for the contents thereof.


Technical Report Documentation Page

1. Report No.: DOT/FAA/AM-01/5
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: Documentation of Validity for the AT-SAT Computerized Test Battery, Volume I
5. Report Date: March 2001
6. Performing Organization Code:
7. Author(s): Ramos, R.A., Heil, M.C., and Manning, C.A.
8. Performing Organization Report No.:
9. Performing Organization Name and Address: Human Resources Research Organization, 68 Canal Center Plaza, Suite 400, Alexandria, VA 22314-1591; FAA Civil Aeromedical Institute, P.O. Box 25082, Oklahoma City, OK 73125
10. Work Unit No. (TRAIS):
11. Contract or Grant No.:
12. Sponsoring Agency Name and Address: Office of Aviation Medicine, Federal Aviation Administration, 800 Independence Ave., S.W., Washington, D.C. 20591
13. Type of Report and Period Covered:
14. Sponsoring Agency Code:
15. Supplemental Notes: Work was accomplished under approved subtask AM-B-99-HRR-517
16. Abstract: This document is a comprehensive report on a large-scale research project to develop and validate a computerized selection battery to hire Air Traffic Control Specialists (ATCSs) for the Federal Aviation Administration (FAA). The purpose of this report is to document the validity of the Air Traffic Selection and Training (AT-SAT) battery according to legal and professional guidelines. An overview of the project is provided, followed by a history of the various job analyses efforts. Development of predictors and criterion measures are given in detail. The document concludes with the presentation of the validation of predictors and analyses of archival data.
17. Key Words: Air Traffic Controllers, Selection, Assessment, Job Analyses
18. Distribution Statement: Document is available to the public through the National Technical Information Service, Springfield, Virginia 22161
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 165
22. Price:

Form DOT F 1700.7 (8-72)  Reproduction of completed page authorized


ACKNOWLEDGMENTS

The editors thank Ned Reese and Jay Aul for their continued support and wisdom throughout
this study. Also, thanks go to Cristy Detwiler, who guided this report through the review process
and provided invaluable technical support in the final phases of editing.


TABLE OF CONTENTS
VOLUME I.
Page
CHAPTER 1 AIR TRAFFIC SELECTION AND TRAINING (AT-SAT) PROJECT .......................................................... 1
CHAPTER 2 AIR TRAFFIC CONTROLLER JOB ANALYSIS ...................................................................................... 7
Prior Job Analyses ................................................................................................................................. 7
Linkage of Predictors to Work Requirements ...................................................................................... 14
CHAPTER 3.1 PREDICTOR DEVELOPMENT BACKGROUND ................................................................................ 19
Selection Procedures Prior to AT-SAT ................................................................................................. 19
Air Traffic Selection and Training (AT-SAT) Project ............................................................................ 21
AT-SAT Alpha Battery ......................................................................................................................... 23
CHAPTER 3.2 AIR TRAFFIC - SELECTION AND TRAINING ALPHA PILOT TRIAL AFTER ACTION REPORT ............... 27
The AT-SAT Pilot Test Description and Administration Procedures ................................................... 27
General Observations .......................................................................................................................... 28
Summary of the Feedback on the AT-SAT Pilot Test Battery ............................................................... 35
CHAPTER 3.3 ANALYSIS AND REVISIONS OF THE AT-SAT PILOT TEST ............................................................. 37
Applied Math Test ............................................................................................................................... 37
Dials Test ............................................................................................................................................ 38
Angles Test .......................................................................................................................................... 38
Sound Test .......................................................................................................................................... 38
Memory Test ....................................................................................................................................... 39
Analogy Test ........................................................................................................................................ 39
Testing Time ....................................................................................................................................... 40
Classification Test ............................................................................................................................... 41
Letter Factory Test ............................................................................................................................... 42
Analysis of Lft Retest ........................................................................................................................... 43
Scan Test ............................................................................................................................................. 47
Planes Test ........................................................................................................................................... 48
Experiences Questionnaire .................................................................................................................. 49
Air Traffic Scenarios ............................................................................................................................ 52
Time Wall/Pattern Recognition Test .................................................................................................... 54
Conclusions ........................................................................................................................................ 55
REFERENCES .............................................................................................................................................. 55


List of Figures and Tables


Figures
Figure 2.1.  Sample description of an AT-SAT measure ................................................................ 61
Figure 2.2.  Example of Linkage Rating Scale ................................................................................ 61
Figure 3.3.1. Plot of PRACCY*PRSPEED. Symbol is value of TRIAL ......................................... 62

Tables

Table 2.1.  SACHA-Generated Worker Requirements ................................................................... 63
Table 2.2.  Worker Requirements Generated Subject Matter Experts ............................................ 65
Table 2.3.  Revised Consolidated Worker Requirements List, With Definitions ........................... 66
Table 2.4.  Mean Worker Requirement Ratings Rank Ordered for all ATCSs ............................... 68
Table 2.5.  Worker Requirement Ratings for Doing the Job for the Three Options and All ATCSs ... 71
Table 2.6.  Worker Requirement Ratings for Learning the Job for the Three Options and All ATCSs ... 73
Table 2.7.  Survey Subactivities for All ATCSs Ranked by the Mean Criticality Index .................. 75
Table 2.8.  Worker Requirement Definitions Used in the Predictor-WR Linkage Survey .............. 78
Table 2.9.  Number of Raters and Intra-Class Correlations for Each Scale .................................... 81
Table 2.10.  AT-SAT Tests Rated as Measuring Each SACHA-Generated Worker Requirement ... 82
Table 2.11.  Indicators of the Success of AT-SAT Measures in Measuring Multiple Worker Requirements ... 86
Table 3.1.1.  Regression Coefficients for PTS Pre-Training Screen ................................................ 87
Table 3.1.2.  Regression Table for Pre-Training Screen .................................................................. 87
Table 3.1.3.  Meta-Analysis of Prior ATCS Validation Studies ...................................................... 89
Table 3.1.4.  Proposed New Measures for the g WR Constructs .................................................... 91
Table 3.1.5.  Proposed New Measures for the Processing Operations WR Constructs ................... 92
Table 3.1.6.  Temperament/Interpersonal Model ............................................................................ 93
Table 3.2.1  Pilot Test Administration: Test Block Sequencing ...................................................... 94
Table 3.2.2  Air Traffic Scenarios Test. Example of Plane Descriptors ........................................... 94
Table 3.2.3  Summary of Proposed Revisions to the AT-SAT Pilot Test ......................................... 95
Table 3.3.1.  Item Analyses and Scale Reliabilities: Non-Semantic Word Scale on the Analogy Test (N=439) ... 97
Table 3.3.2.  Item Analyses and Scale Reliabilities: Semantic Word Scale on the Analogy Test ..... 98
Table 3.3.3.  Item Analyses and Scale Reliabilities: Semantic Visual Scale on the Analogy Test .... 99
Table 3.3.4.  Item Analyses and Scale Reliabilities: Non-Semantic Visual Scale on the Analogy Test ... 100
Table 3.3.5.  Distribution of Test Completion Times for the Analogy Test .................................. 100
Table 3.3.6.  Estimates of Test Length to Increase Reliability of the Analogy Test ....................... 101
Table 3.3.7.  Item Analyses and Scale Reliabilities: Non-Semantic Word Scale on the Classification Test ... 101
Table 3.3.8.  Item Analyses and Scale Reliabilities: Semantic Word Scale on the Classification Test ... 102


Table 3.3.9. Item Analyses and Scale Reliabilities: Non-Semantic Visual Scale on the Classification Test ... 102
Table 3.3.10. Item Analyses and Scale Reliabilities: Semantic Visual Scale on the Classification Test ............ 103
Table 3.3.11. Distribution of Test Completion Times for the Classification Test (N=427) ............................. 103
Table 3.3.12. Estimates of Test Length to Increase Reliability of the Classification Test ................................ 104
Table 3.3.13. Planning/Thinking Ahead: Distribution of Total Number Correct on the Letter Factory Test . 104
Table 3.3.14. Distribution of Number of Inappropriate Attempts to Place a Box in the Loading Area on the
Letter Factory Test (Form A) (N = 441) ................................................................................... 104
Table 3.3.15. Recall from Interruption (RI) Score Analyses on the Letter Factory Test (Form A) .................. 105
Table 3.3.16. Planning/Thinking Ahead: Reliability Analysis on the Letter Factory Test (Form A) ............... 105
Table 3.3.17. Situational Awareness (SA) Reliability Analysis: Three Scales on the Letter Factory Test ....... 106
Table 3.3.18. Situational Awareness (SA) Reliability Analysis: One Scale on the Letter Factory Test
(Form A) .................................................................................................................................. 108
Table 3.3.19. Planning/Thinking Ahead: Distribution of Total Number Correct on the Letter Factory Test
(Form B) .................................................................................................................................. 108
Table 3.3.20. Distribution of Number of Inappropriate Attempts to Place a Box in the Loading Area on the
Letter Factory Test (Form B) (N = 217) ................................................................................... 109
Table 3.3.21. Tests of Performance Differences Between LFT and Retest LFT (N = 184) ............................. 109
Table 3.3.22. Distribution of Test Completion Times for the Letter Factory Test (N = 405) ......................... 109
Table 3.3.23. Proposed Sequence Length and Number of Situational Awareness Items for the Letter
Factory Test.............................................................................................................................. 110
Table 3.3.24. Distribution of Number Correct Scores on the Scan Test (N = 429) ....................................... 110
Table 3.3.25. Scanning: Reliability Analyses on the Scan Test ....................................................................... 111
Table 3.3.26. Distribution of Test Completion Times for the Scan Test (N = 429) ....................................... 112
Table 3.3.27. Reliability Analyses on the Three Parts of the Planes Test ........................................................ 112
Table 3.3.28. Distribution of Test Completion Times for the Planes Test ...................................................... 112
Table 3.3.29. Generalizability Analyses and Reliability Estimates .................................................................. 113
Table 3.3.30. Correlations of Alternative ATST Composites with End-of-Day Retest Measure .................... 115
Table 3.3.31. Time Distributions for Current Tests ....................................................................................... 116
Appendices
Appendix A AT-SAT Prepilot Item Analyses: AM (Applied Math) Test Items That Have Been Deleted ....... A1
Appendix B Descriptive Statistics, Internal Consistency Reliabilities, Intercorrelations, and Factor Analysis
Results for Experience Questionnaire Scales ............................................................................. B1


CHAPTER 1
AIR TRAFFIC SELECTION AND TRAINING (AT-SAT) PROJECT

Robert A. Ramos, HumRRO

INTRODUCTION

This document is a comprehensive report on a large-scale research project to develop and validate a computerized selection battery to hire Air Traffic Control Specialists (ATCSs) for the Federal Aviation Administration (FAA). The purpose of this report is to document the validity of the Air Traffic Selection and Training (AT-SAT) battery according to legal and professional guidelines. The Dictionary of Occupational Titles lists the Air Traffic Control Specialist Tower as number 193162018.

Background

The ATCS position is unique in several respects. On the one hand, it is a critically important position at the center of efforts to maintain air safety and efficiency of aircraft movement. The main purpose of the ATCS job is to maintain a proper level of separation between airplanes. Separation errors may lead to situations that could result in a terrible loss of life and property. Given the consequences associated with poor job performance of ATCSs, there is great concern on the part of the FAA to hire and train individuals so that air traffic can be managed safely and efficiently. On the other hand, the combination of skills and abilities required for proficiency in the position is not generally prevalent in the labor force. Because of these characteristics, ATCSs have been the focus of a great deal of selection and training research over the years.

Historical events have played a major role in explaining the present condition of staffing, selection, and training systems for ATCSs. In 1981, President Ronald Reagan fired striking ATCSs. Approximately 11,000 of 17,000 ATCSs were lost during the strike. Individuals hired from August 1981 to about the end of 1984 replaced most of the strikers. A moderate level of new hires was added through the late 1980s. However, relatively few ATCSs have been hired in recent years due to the sufficiency of the controller workforce. Rehired controllers and graduates of college and university aviation training programs have filled most open positions.

Starting in fiscal year 2005, a number of the post-1981 hires will start to reach retirement eligibility. As a consequence, there is a need for the Air Traffic Service to hire five to eight hundred ATCS candidates a year for the next several years to maintain proper staffing levels. The majority of the new hires will have little background in ATCS work. Further, it generally takes two to four years to bring ATCS developmentals to the full performance level (FPL).

In addition, the FAA Air Traffic Training Program has designed major changes in the staffing and training of new recruits for the ATCS position. In the past, training at the FAA Academy included aspects of a screening program. The newly developed AT-SAT selection battery is designed to provide the vehicle that will screen all candidates into the new Multi-Path Training Model. One of the important characteristics of the new training process is that it will no longer have a screening goal. The program will assume that candidates have the basic skills needed to perform the work of the ATCS. To implement the new training model, a selection process that screens candidates for the critical skills needed to perform the job is required. A multi-path hiring model implemented with AT-SAT and augmented by a revised training program will likely reduce ATCS training length and time to certification.

Given this background, i.e., the demographics related to potential retirements and the new staffing requirements associated with training, there was a need to start the ATCS recruiting, selection, and training process in fiscal year 1997-1998. In spite of this immediate need to hire recruits, there were no currently feasible selection processes available to the FAA for use in the identification and selection of ATCSs. Test batteries that had been used in the past had become compromised, obsolete, or were removed from use for other reasons.

A two-stage selection process consisting of an OPM test battery and a nine-week Academy screen was introduced during the 1980s to select candidates for the position of air traffic controller. This two-stage process was both expensive and inefficient. First, candidates took a paper-and-pencil test administered by the Office of Personnel Management (OPM). A rank-ordered list of candidates based on the OPM test scores was established. Candidates were listed according to their OPM test score plus any veterans' points. Candidates at the top of the list were hired, provided they could pass medical and security screening.
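The rank-and-screen step just described can be sketched as a short routine. This is an illustration only, not the FAA's or OPM's actual system; the candidate fields, the flat veterans'-preference value, and the pass/fail screening flags are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    opm_score: float        # score on the OPM written test
    veteran_points: float   # veterans' preference points, if any (assumed flat value)
    passed_medical: bool
    passed_security: bool

def build_hiring_list(candidates, n_openings):
    """Rank candidates by OPM score plus veterans' points and select from the
    top, keeping only those who pass medical and security screening
    (mirroring the two-stage process described above)."""
    ranked = sorted(candidates,
                    key=lambda c: c.opm_score + c.veteran_points,
                    reverse=True)
    eligible = [c for c in ranked if c.passed_medical and c.passed_security]
    return eligible[:n_openings]

# Hypothetical usage
pool = [
    Candidate("A", 92.5, 0.0, True, True),
    Candidate("B", 90.0, 5.0, True, True),   # veterans' preference lifts B above A
    Candidate("C", 95.0, 0.0, False, True),  # screened out on medical
]
print([c.name for c in build_hiring_list(pool, 2)])   # ['B', 'A']
```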
Once candidates were hired, they entered a nine-week screening program at the FAA Academy. Although
modified several times during the 1980s, the basic
program consisted of time spent in a classroom environment followed by work in laboratory-based, non-radar
simulations. The classroom phase instructed employees
on aircraft characteristics, principles of flight, the National Airspace System, and basic rules for separating
aircraft in a non-radar situation. During the ATCS
simulations phase, employees were taught and evaluated in an environment that emulated the work performed in an ATCS facility.
The OPM test had been in use, without revision,
since 1981. In addition, test taking strategies and coaching programs offered by private companies increased the
test scores of candidates without an apparent comparable increase in the abilities required to perform in the
screen. The artificial increase in test scores apparently
reduced the capability of the test to accurately identify
the highest qualified individuals to hire. Due at least in
part to the artificially inflated OPM scores, less than
40% of the ATCS trainees successfully completed the
nine-week screen. A full discussion of prior selection
procedures for ATCSs is provided in Chapter 6 on
Archival Data Analyses.
Research and development efforts were begun to
create a new selection device. One such research effort
was the Separation and Control Hiring Assessment
(SACHA) project initiated in 1991. SACHA focused
on performing a job analysis of the air traffic controller
position, developing ways to measure ATCS job performance, and identifying new tests suitable for selecting
controllers. The SACHA contract expired in 1996.
Another research and development effort, the Pre-Training Screen (PTS), did produce a one-week selection test designed to replace the nine-week Academy
screening process. However, the validity of the PTS was
heavily weighted toward supervisor ratings and times to
complete field training, along with performance in the
Radar Training program. The FAA continued to use the
PTS to screen candidates at a time when there was severe
reduction in hiring, but there was no near-term potential to be hired. Meanwhile, the SACHA project was already underway and was partly envisioned as the next stage to the PTS.
In addition, the FAA had redesigned the initial
qualification training program in anticipation that the
PTS would filter candidates prior to their arrival at the
Academy. The nine-week-long screening process was
replaced with a program that focused on training, rather
than screening candidates for ATCS aptitudes. As a
result, the FAA had an initial training program but no
pre-hire selection system other than the OPM written test.
The purpose of the AT-SAT project was to develop
a job-related, legally defensible, computerized selection
battery for ATCSs that was to be delivered to the FAA
on October 1, 1997. The AT-SAT project was initiated
in October of 1996. The requirement to complete the
project within a year was dictated by the perceived need
to start selecting ATCS candidates in 1997.

Organization of Report
A collaborative team, made up of several contractors
and FAA employees, completed the AT-SAT project
and this report. Team members included individuals
from the Air Traffic Division of the FAA Academy and
Civil Aeromedical Institute (CAMI) of the FAA, Caliber, Personnel Decisions Research Institutes (PDRI),
RGI, and the Human Resources Research Organization
(HumRRO). The Air Traffic Division represented the
FAA management team, in addition to contributing to
predictor and criterion development. CAMI contributed to the design and development of the job performance measures. Caliber was the prime contractor and
was responsible for operational data collection activities
and job analysis research. PDRI was responsible for
research and development efforts associated with the job
performance measures and development of the Experience Questionnaire (EQ). RGI was responsible for
developmental activities associated with the Letter Factories Test and several other predictors. HumRRO had
responsibility for project management, predictor development, data base development, validity data analysis,
and the final report.
The final report consists of six chapters, with each
chapter written in whole or part by the individuals
responsible for performing the work. Contents of each
chapter is summarized below:


Chapter 1 - Introduction: contains an overview of the project, including background and setting of the
problem addressed, and methodology used to validate
predictor measures.
Chapter 2 - Job Analysis: summarizes several job
analyses that identified the tasks, knowledges, skills, and
abilities required to perform the ATCS job. This chapter also contains a linkage analysis performed to determine the relationship between worker requirements
identified in the job analysis to the predictor measures
used in the validation study.
Chapter 3 - Predictor Development: focuses on how
the initial computerized test battery was developed from
job analysis and other information. This chapter also
discusses construction of multi-aptitude tests and alternative predictors used to measure several unique worker
requirements as well as the initial trial of the tests in a
sample of students in a naval training school.
Chapter 4 - Criterion Development: discusses the
development and construct validity of three criterion
measures used to evaluate ATCS job performance.
Chapter 5 - Validation of Predictors: presents the
predictor-criterion relationships, fairness analyses, and
a review of the individual elements considered in deciding on a final test battery.
Chapter 6 - Analyses of Archival Data: discusses the
results of analyses of historical data collected and maintained by CAMI and its relationship to AT-SAT variables.

Design of Validity Study

Step 1: Complete Predictor Battery Development
The tasks, knowledges, skills, and abilities (worker requirements) of the air traffic control occupation were identified through job analysis. Several prototype predictor tests were developed to cover the most important worker requirements of ATCSs. The management team held a predictor test review conference in November 1996 in McLean, Virginia. At the meeting, all prototype predictor tests were reviewed to determine which were appropriate and could be ready for formal evaluation in the validity study. Twelve individual predictor tests were selected. Step 1 of the management plan was to complete the development of the prototype tests and combine them into a battery that could be administered on a personal computer. This initial test battery was designated the Alpha battery.

It was also decided at this meeting to limit the validation effort to a sample of full performance level en route ATCSs to help ensure that the validity study would be completed within a year. Several considerations went into the decision to perform the validation study on en route controllers. Neither the development of a common criterion measure nor separate criterion measures for en route, tracon, and tower cab controllers was compatible with completing the validity study within one year. The solution was to limit the study to the single en route specialty. The SACHA job analysis concluded that there were not substantial differences in the rankings of important worker requirements between en route, tracon, and tower cab specialties. In addition, considerable agreement was found between subactivity ratings for the three specialties. Flight service, on the other hand, appeared to have a different pattern of important worker requirements and subactivity rankings than the other options. The en route option was viewed as reasonably representative of options that control air traffic, i.e., en route, tracon, and tower cab specialists. Further, the number of en route specialists was large enough to meet the sample size requirements of the validation study.

In addition, Step 1 of the management plan included the requirement of a pilot test of the Alpha battery on a sample that was reasonably representative of the ATCS applicant population. Data from the pilot sample would be used to revise the Alpha test battery on the basis of the results of analyses of item, total score, and test intercorrelations. Beta, the revised test battery, was the battery administered to en route ATCSs in the concurrent validity sample and pseudo-applicant samples. Test development activities associated with cognitive and non-cognitive predictors, the pilot sample description, results of the pilot sample analyses, and resultant recommendations for test modifications are presented in Chapter Three.

Step 2: Complete Criterion Measure Development


Three job performance measures were developed to
evaluate en route job performance. By examining different aspects of job performance, it was felt that a more
complete measure of the criterion variance would be
obtained. The three measures included supervisor and
peer ratings of typical performance, a computerized job
sample, and a high-fidelity simulation of the ATCS job.
Because the high-fidelity simulation provided the most
realistic environment to evaluate controller performance,
it was used to evaluate the construct validity of the other
two criterion measures. The research and development
effort associated with the criterion measures is presented
in Chapter 4.


Step 3: Conduct Concurrent Validation Study


The job relatedness of the AT-SAT test battery was
demonstrated by means of a criterion-related validity
study. By employing the criterion-related validation
model, we were able to demonstrate a high positive
correlation between test scores on AT-SAT and the job
performance of a large sample of en route ATCSs.
Because of the amount of time required for ATCSs to
reach full performance level status, i.e., two to four
years, and the project requirement of completion within
a year, a concurrent criterion-related design was employed in the AT-SAT study. In a concurrent validation
strategy, the predictor and job performance measures
are collected from current employees at approximately
the same time.
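As a concrete illustration of the concurrent criterion-related model just described, the sketch below estimates a validity coefficient from predictor and criterion scores collected from incumbents at about the same time. The simulated data, variable names, and sample size are hypothetical; the study's actual analyses appear in Chapter 5.

```python
import numpy as np

# Hypothetical data for incumbent controllers: an AT-SAT composite and a job
# performance criterion collected at approximately the same time.
rng = np.random.default_rng(1)
n = 900                                              # roughly the number of predictor-criterion pairs noted below
ability = rng.normal(0.0, 1.0, n)
predictor = 0.7 * ability + rng.normal(0.0, 0.7, n)  # illustrative AT-SAT composite score
criterion = 0.6 * ability + rng.normal(0.0, 0.8, n)  # illustrative composite job performance score

# Under a concurrent design, the criterion-related validity estimate is the
# predictor-criterion correlation computed in the incumbent sample.
validity = np.corrcoef(predictor, criterion)[0, 1]
print(f"observed concurrent validity: r = {validity:.2f}")
```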
The original goal for the number of total participants
in the study was 750 en route ATCSs, including 100
representatives from each of the major protected classes.
Over 900 pairs of predictor and criterion cases were
collected in Phase I of the concurrent study. However,
the goal of collecting 100 African American and Hispanic ATCS cases was not achieved. As a consequence,
the FAA decided to continue the concurrent validity
study to obtain a greater number of African American
and Hispanic study participants. These additional data
were required to improve statistical estimates of fairness
of the AT-SAT battery. In Phase II, data were collected
from en route sites that had not participated in Phase I.
In addition, a second request for study participation was
made to ATCSs in sites that had been a part of Phase I.
All 20 en route sites participated in the AT-SAT study.
It should be noted that because of an ATCS employee
union contract provision, each study participant had to
volunteer to be included in the study. Consequently, the
completion of the study was totally dependent on the
good will of the ATCSs, and a significant amount of
effort was expended in convincing them of the need and
value of their participation in the AT-SAT project. A
similar effort was directed at employee groups representing the protected classes. In the final analysis, however, each study participant was a volunteer. The FAA
had no real control over the composition of the final
ATCS sample. The data collection effort was anticipated to be highly intrusive to the operations of en route
centers. There was substantial difficulty associated with
scheduling and arranging for ATCS participation. The
FAA's Air Traffic Training management team had the responsibility to coordinate acceptance and participation in the study of all stakeholders. The demographics of the obtained samples, corrected and uncorrected results of predictor and criterion analyses, and group
difference and fairness analyses are discussed in Chapter 5.
Step 4: Conduct Pseudo-Applicant Study
Full-performance-level ATCSs are a highly selected
group. As indicated earlier, even after the OPM selection battery was used to select candidates for ATCS
training, there was still a 40% loss of trainees through
the Academy screen and another 10% from on-the-job
training. Under these conditions, it was highly likely
that the range of test scores produced by current ATCSs
would be restricted. Range restriction in predictor scores
suggests that ATCSs would demonstrate a lower degree
of variability and higher mean scores than an unselected
sample. A restricted set of test scores, when correlated
with job performance measures, is likely to underestimate the true validity of a selection battery. Therefore, to obtain validity estimates that more closely
reflected the real benefits of a selection battery in an
unselected applicant population, validity coefficients
were corrected for range restriction. Range restriction
corrections estimate what the validity estimates would
be if they had been computed on an unselected, unrestricted applicant sample.
One method of obtaining the data required to perform the range restriction corrections is to obtain a
sample of test scores from a group of individuals that is
reasonably representative of the ATCS applicant pool.
The best sources of data for this purpose are real
applicants, but this information would not become
available until the test battery was implemented. Both
military and civilian pseudo-applicant samples were
administered the AT-SAT battery for the purpose of
estimating its unrestricted test variance and correcting
initial validity estimates for range restriction. Pseudo
applicant data were also used to obtain initial estimates
of potential differences in test scores due to race and
gender. The range restriction corrections resulted in
moderate to large improvements in estimates of validity
for the cognitive tests. These results are shown in
Chapter 5. The final determination of these corrections
will, by definition, require analyses of applicant data.
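Range restriction corrections of the kind described above are commonly made with the Thorndike Case II formula, which uses the ratio of the unrestricted predictor standard deviation (here, estimated from the pseudo-applicant samples) to the restricted standard deviation among incumbents. The sketch below is a minimal illustration with invented numbers; it is not the report's actual computation, and the specific correction formula used in Chapter 5 is not stated in this section.

```python
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction: estimate the predictor-criterion
    correlation in an unselected applicant group from the correlation
    observed in a range-restricted (incumbent) sample."""
    u = sd_unrestricted / sd_restricted          # degree of restriction on the predictor
    r = r_restricted
    return (r * u) / math.sqrt(1.0 + r * r * (u * u - 1.0))

# Illustrative values only: an observed validity of .30 among incumbents, with the
# pseudo-applicant sample showing 1.5 times the predictor spread of the incumbents.
print(round(correct_for_range_restriction(0.30, 1.5, 1.0), 2))   # ~0.43
```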
Step 5: Analyses and Validation of Predictors
Data management was a particularly critical issue on
the AT-SAT project. Plans to receive, log in, and process
data from 15 sites over an eight-week period were
created. In addition, a final analysis database was developed so that the validity analyses of the predictors could be completed within a two-week time frame. Plans were
also made on the methodology used to determine the
validity of the predictors. These included predictor-criterion relationships and reviews of the individual
elements to consider when deciding on the final test
composite. There was a need to include special test
composite analyses examining the interplay of differences between groups, the optimization of particular
criterion variables, and coverage of worker requirements
and their effect on validity. In particular, the final
composition of the AT-SAT battery represented a combination of tests with the highest possible relation to job
performance and smallest differences between protected
classes. All validity and fairness analyses are presented in
Chapter 5.
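The composite analyses described above, which weigh predictive validity against differences between protected classes, can be illustrated with a simple enumeration of candidate test weightings. Everything in the sketch below is hypothetical — the test names, the simulated data, the candidate weights, and the rule for balancing validity against the group difference — and it is not the report's method; the actual composite and fairness analyses are presented in Chapter 5.

```python
import numpy as np
from itertools import product

# Hypothetical predictor-test scores, a criterion, and a subgroup indicator.
rng = np.random.default_rng(2)
n = 1000
group = rng.integers(0, 2, n)                      # 0/1 protected-class indicator (illustrative)
math_test = rng.normal(0.3 * group, 1.0, n)        # test with a larger group difference
scan_test = rng.normal(0.1 * group, 1.0, n)        # test with a smaller group difference
criterion = 0.5 * math_test + 0.3 * scan_test + rng.normal(0, 1.0, n)

def evaluate(w_math, w_scan):
    """Validity and standardized group difference for one weighted composite."""
    composite = w_math * math_test + w_scan * scan_test
    validity = np.corrcoef(composite, criterion)[0, 1]
    d = (composite[group == 1].mean() - composite[group == 0].mean()) / composite.std(ddof=1)
    return validity, d

# Enumerate candidate weightings and keep the one with the best balance of high
# validity and small group difference (the balancing rule here is an assumption).
results = {(wm, ws): evaluate(wm, ws) for wm, ws in product([0.25, 0.5, 0.75, 1.0], repeat=2)}
best_weights = max(results, key=lambda w: results[w][0] - abs(results[w][1]))
print(best_weights, results[best_weights])
```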

Step 6: Deliver Predictor Battery and Supporting Documentation
The final deliverable associated with the AT-SAT
project was the AT-SAT test battery, version 1.0, on a
compact disc (CD). The goal of developing a selection
test battery for the ATCS that was highly job related and
fair to women and minorities was achieved. Included
with the CD are source code, documentation, and a
user's manual. In addition, a database containing all raw
data from the project was provided to the FAA.


CHAPTER 2
AIR TRAFFIC CONTROLLER JOB ANALYSIS
Ray A. Morath, Caliber Associates
Douglas Quartetti, HumRRO
Anthony Bayless, Claudet Archambault
Caliber Associates
PRIOR JOB ANALYSES

The foundation for the development of the AT-SAT predictor battery, as well as the job performance measures, was the Separation and Control Hiring Assessment (SACHA) job analysis (Nickels, Bobko, Blair, Sands, & Tartak, 1995). This traditional, task-based job analysis had the general goals of (a) supporting the development of predictor measures to be used in future selection instrumentation, (b) supporting the identification of performance dimensions for use in future validation efforts, and (c) identifying differences in the tasks and worker requirements (WRs; knowledges, skills, abilities, and other characteristics, or KSAOs) of the different ATCS options (Air Route Traffic Control Center, ARTCC; Terminal, and Flight Service) and ATCS job assignments (ARTCC, Terminal Radar Approach Control, TRACON; Tower Cab, and Automated Flight Service Station, AFSS).

Review of Existing ATCS Job Analysis Literature
Nickels et al. (1996) began by reviewing and consolidating the existing ATCS job analysis literature. This integration of the findings from previous job-analytic research served as the initial source of information on ATCS jobs prior to any on-site investigations and helped to focus the efforts of the project staff conducting the site visits. A core group of job analysis studies also provided much of the information that went into developing preliminary lists of tasks and WRs for the SACHA project. The following is a review of the major findings from selected studies that were most influential to SACHA and provided the greatest input to the preliminary task and WR lists.

Computer Technologies Associates (CTA)
CTA conducted a task analysis of the ARTCC, TRACON, and Tower Cab assignments with the goal not only of understanding how the jobs were currently performed but also of anticipating how these jobs would be performed in the future within the evolving Advanced Automation System (AAS).1 They sought to identify the information processing tasks of ARTCC, TRACON, and Tower Cab controllers in order to help those designing the AAS to gain insight into controller behavioral processes (Ammerman et al., 1983).

An extensive assortment of documents was examined for terms suitable to the knowledge data base, including FAA, military, and civilian courses. Listed below are the sources of the documents examined for ATCS terms descriptive of knowledge topics and technical concepts:

Civilian publications
Community college aviation program materials
Contractor equipment manuals
FAA Advisory Circulars
FAA air traffic control operations concepts
FAA documents
FAA orders
Local facility handbooks
Local facility orders
Local facility training guides and programs
NAS configuration management documents
National Air Traffic Training Program (manuals, examinations, lesson plans, guides, reference materials, workbooks, etc.)
Naval Air Technical Training Center air traffic controller training documents
U.S. Air Force regulations and manuals

Alexander, Alley, Ammerman, Fairhurst, Hostetler, Jones, & Rainey, 1989; Alexander, Alley, Ammerman, Hostetler, & Jones,
1988; Alexander, Ammerman, Fairhurst, Hostetler, & Jones, 1989; Alley, Ammerman, Fairhurst, Hostetler, & Jones, 1988;
Ammerman, Bergen, Davies, Hostetler, Inman, & Jones, 1987; Ammerman, Fairhurst, Hostetler, & Jones, 1989.


Detailed task statements from each controller option were organized hierarchically into more global and interpretable subactivity and activity categories. Within CTA's framework, one or more tasks comprised a subactivity, with multiple subactivities subsumed under a single activity. Hence, this approach generated three levels of job performance descriptors. It was found that controllers in each of the three ATCS options (ARTCC, TRACON, and Tower Cab) performed about 6-7 of the more general activities, approximately 50 subactivities, and typically several hundred tasks.

The results of the CTA task analysis indicated that activity categories as well as subactivity categories were similar across the assignments of ARTCC, TRACON, and Tower Cab, with only small variations in the tasks across the options. These findings suggested that the more global activities performed in each of these three controller jobs are almost identical. Additionally, job analysis results revealed 14 cognitively oriented worker requirements (WRs) that were found to be critical in the performance of tasks across the three assignments. These WRs were:

Coding
Decoding
Deductive reasoning
Filtering
Image/pattern recognition
Inductive reasoning
Long-term memory
Mathematical/probabilistic reasoning
Movement detection
Prioritizing
Short-term memory
Spatial scanning
Verbal filtering
Visualization

Human Technologies, Inc. (HTI)
HTI (1991) conducted a cognitive task analysis with ARTCC controllers to analyze mental models and decision-making strategies of expert controllers. An additional goal was to determine the effect of controller experience and expertise on differences in controller knowledges, skills, mental models, and decision strategies. Cognitive task analysis was performed by videotaping controllers during various traffic scenarios and having them describe in detail what they were thinking while they were handling the traffic scenarios.

The general findings of the cognitive task analysis were that ARTCC controllers' mental models for the control of air traffic could be broken down into three general categories, which were termed (a) sector management, (b) prerequisite information, and (c) conditions. These three categories roughly parallel the respective information processing requirements of short-term memory, long-term memory, and switching mechanisms. The resulting 12 cognitively oriented tasks were:

Maintain situational awareness
Develop and revise sector control plan
Resolve aircraft conflict
Reroute aircraft
Manage arrivals
Manage departures
Receive handoff
Receive pointout
Initiate handoff
Initiate pointout
Issue advisory
Issue safety alert

Another important finding from the study was that, due to their more effective working memory, experts have access to more information than novices. That is, experts have a greater chunking capacity. Experts are also more efficient in the control of aircraft because they typically use a smaller number of strategies per traffic scenario, and have a greater number of strategies that they can employ. Finally, the study found that expert ARTCC controllers differed from novices in their performance on the two most important cognitive tasks: maintaining situational awareness and revising the sector control plan.

HTI's work also involved investigating the communications within radar teams (radar and associate radar) as well as the communications between radar associates and controllers in other sectors. Teams were studied in both live and simulated traffic situations. Communication data were coded in relation to 12 major controller tasks that were found in the cognitive task analysis. The data indicated that nearly all communication between team members concerned the two tasks deemed most critical by the cognitive task analysis: maintain situational awareness, and develop and revise sector control plan.

A summary job analysis document (HTI, 1993) presents all the linkages from seven sets of prototype documents representative of air traffic controller job analyses.


The objective of this summary process was to systematically combine the results from the air traffic controller job analyses into a single document that emphasizes the task-to-KSAO linkages for the jobs of En Route, Flight Service Station, combined TRACON and Tower (Terminal), Tower, and TRACON controllers.

The results reported in the HTI summary job analysis were based on individual job analysis summaries, which included a cognitive task analysis and the use of the Position Analysis Questionnaire (Meecham & McCormick, 1969). The HTI analysis also utilized the CTA task analysis that had standardized task and KSAO data from existing air traffic control job analyses. In the individual job analyses, the controller tasks and KSAO data were translated into Standard Controller Taxonomies. Then, the linkages for each controller job were identified and placed in standard matrices based on these taxonomies.

The Task Taxonomy includes a total of 41 task categories grouped as follows:

Perceptual Tasks
Discrete Motor Tasks
Continuous Psychomotor Tasks
Cognitive Tasks
Communication Tasks

The KSAO Taxonomy has a total of 48 KSAO categories divided into the following three groupings:

Abilities
Knowledge
Personality Factors

This resulted in a 41-by-48 task-to-KSAO matrix that permits the standard listing of task-to-KSAO linkages from different job analyses. The summary document (HTI, 1993), which incorporated the individual job analyses as well as the CTA report, was reviewed and utilized by the AT-SAT researchers in determining the contribution of these reports to the understanding of the air traffic controller job.

Embry-Riddle
Using a hierarchical arrangement of activities and tasks borrowed from CTA, Embry-Riddle researchers (Gibb et al., 1991) found that five activities and 119 tasks subsumed under those more global activities were identified as critical to controller performance in the non-radar training screen utilized with ARTCC and Terminal option controllers. The five global activities identified by Embry-Riddle investigators were:

Setting up the problem
Problem identification
Problem analysis
Resolve aircraft conflicts
Manage air traffic sequences

Upon the basis of the results of the task inventory, existing documentation, and the information obtained from meetings with training instructors, the following 18 attributes were identified as critical in performing the activities and tasks involved in the training:

Spatial visualization
Mathematical reasoning
Prioritization
Selective attention
Mental rotation
Multi-task performance (time sharing)
Abstract reasoning
Elapsed time estimation and awareness
Working memory - attention capacity
Working memory - activation capacity
Spatial orientation
Decision making versus inflexibility
Time sharing - logical sequencing
Vigilance
Visual spatial scanning
Time-distance extrapolation
Transformation
Perceptual speed

In addition to identifying the tasks and abilities required for success in training, another goal of this project was to determine the abilities necessary for success on the ATCS job. The Embry-Riddle team employed Fleishman's ability requirements approach for this purpose. Utilizing the task-based results of the CTA job analysis (Ammerman et al., 1987), they had ARTCC, TRACON, and Tower Cab controllers rate CTA tasks on the levels of abilities needed to successfully perform those CTA-generated tasks. Using Fleishman's abilities requirements taxonomy (Fleishman & Quaintance, 1984), these subject matter experts (SMEs) rated the levels of perceptual-motor and cognitive abilities required for each of the tasks.


It was found that the abilities rated by controllers as critical for controller performance were highly similar to those found by the CTA study; they were also quite similar to those abilities identified by the Embry-Riddle team as important to success in the non-radar training screen. ARTCC controllers rated the following abilities from Fleishman's scales as necessary to perform the CTA-generated ARTCC tasks:

Deductive reasoning
Inductive reasoning
Long-term memory
Visualization
Speed of closure
Time sharing
Flexibility of closure (selective attention)
Category flexibility
Number facility
Information ordering

Those abilities rated by Terminal controllers as required to perform the Terminal option tasks were:

Selective attention
Time sharing
Problem sensitivity
All of Fleishman's physical abilities related to visual, auditory, and speech qualities
- Oral expression
- Deductive reasoning
- Inductive reasoning
- Visualization
- Spatial orientation
- All perceptual speed abilities

The Embry-Riddle researchers presented no discussion on why differences in abilities between ARTCC and Terminal controllers were found.

Landon
Landon (1991) did not interview SMEs, observe controllers, or canvass selected groups to collect job analysis information. Rather, Landon reviewed existing documents and job analysis reports and summarized this information. Landon's focus was to identify and classify the types of tasks performed by controllers. Using CTA's hierarchical categorization of tasks, the ATCS tasks were organized into three categories based upon the type of action verb within each task:

I. Information Input Tasks
Receive, interpret, compare and filter information
Identify information needing storage or further processing
A. Scanning and monitoring
B. Searching

II. Processing Tasks
Organize, represent, process, store, and access information
A. Analytical planning
B. Maintain picture in active memory
C. Long-term memory
D. System regulation

III. Action/Output Tasks
Physical and verbal actions to communicate and record information
A. Communicate outgoing messages
B. Update flight records
C. Operate controls, devices, keys, switches

Myers and Manning
Myers and Manning (1988) performed a task analysis of the Automated Flight Service job for the purpose of developing a selection instrument for use with AFSS. Using the CTA hierarchy to organize the tasks of the job, Myers and Manning employed SME interviews and surveys to identify the activities, subactivities, and tasks of the job. They found 147 tasks, 21 subactivities, and the five activities that they felt comprised the jobs of the AFSS option. The activities that they identified were:

Process flight plans
Conduct pilot briefing
Conduct emergency communications
Process data communications
Manage position resources

Using the subactivities as their focus, Myers and Manning identified those WRs required to successfully perform each subactivity. Unlike the CTA and Embry-Riddle job analyses, the WRs identified were much more specific in nature, as evidenced by the following examples:

Ability to operate radio/receive phone calls
Ability to use proper phraseology
Ability to keep pilots calm
Ability to operate Model 1 equipment


Summary of Previous Studies


SACHA project staff summarized the findings from
the previous job analyses and identified the commonalities across those reports regarding the tasks and worker
requirements. Specifically, they compared CTAs worker
requirements with those reported by Embry-Riddle.
Additionally, once the SACHA-generated lists were
completed, the researchers mapped those worker requirements to those reported by CTA. In general, the
global categories of tasks and the hierarchical organization of tasks for the ARTCC and Terminal options were
common across the previous studies. Additionally, the
sub-activities and worker requirements identified in
previous research for those two ATCS options were
similar. Finally, the previous job analyses illustrated
differences in tasks and worker requirements between
the AFSS option and the other two options.

SACHA Site Visits
After reviewing and summarizing the existing job analysis information, the SACHA project staff visited sites to observe controllers from the various options and assignments. More than a dozen facilities, ranging from ARTCC, Level II to V Tower Cab and TRACON, and AFSS facilities, were visited. The primary purpose of these initial site visits was to gain a better understanding of the ATCS job. SACHA project staff not only observed the controllers from the various options performing their job, but they also discussed the various components of the job with the controllers, their trainers, and supervisors.

Development of Task and WR Lists

Developing Preliminary Lists
On the basis of the results of the previous job analyses as well as the information obtained from the site visits, SACHA's project staff developed preliminary task and WR lists. Given the strengths of the CTA job analysis regarding (a) its level of specificity, (b) its hierarchical arrangement of tasks, and (c) its focus on both the current ATCS job and how the job is likely to be performed in the future, SACHA decided to use the task analysis results of CTA as the basis for preliminary task lists with the options of ARTCC, TRACON, and Tower Cab. Similarly, Myers and Manning performed a relatively extensive job analysis of the AFSS position, which had been modeled after CTA; they, too, had used a hierarchical categorization of tasks and had a high degree of specificity at the molecular, task level. Hence, four preliminary task lists were developed for the ATCS job assignments of ARTCC, TRACON, Tower Cab, and AFSS, with task-based findings from CTA and the Myers and Manning results serving as the primary source of task-based information for the respective options. Each of these lists contained from six to eight global activities, 36 to 54 subactivities, and hundreds of tasks.

Several steps were followed in the development of the list of ATCS worker requirements. On the basis of their review of existing literature and their knowledge of those relevant KSAO constructs, SACHA project staff developed an initial list of 228 WRs. They then conducted a three-day workshop dedicated to refining this initial list. Consensus judgments were used to eliminate WRs that were thought to be irrelevant or redundant. Finally, a single preliminary WR list was formulated that contained 73 WRs grouped into 14 categories (reasoning, computational ability, communication, attention, memory, metacognitive, information processing, perceptual abilities, spatial abilities, interpersonal, work and effort, stability/adjustment, self-efficacy, and psychomotor). This list of WRs is presented in Table 2.1 and will henceforth be termed the SACHA-generated WRs list.

Panel Review of Preliminary Lists
A panel of five controllers assigned to FAA Headquarters participated in a workshop to review and revise the SACHA materials and procedures for use in additional job analysis site visits. These five controllers, who represented each of the options of ARTCC, Terminal, and Flight Service, began the workshop by undergoing a dry run of the planned field procedures in order to critique the procedures and offer suggestions as to how they might be improved. Second, the panel reviewed and edited the existing task lists for the various options, mainly to consolidate redundant task statements and clarify vague statements. Finally, the panel reviewed the SACHA-generated WRs list. Upon discussion, the panel made no substantive changes to either the task lists for the various options or the SACHA-generated WRs list.

Developing and Revising Task Lists in the Field


In field job analysis meetings held in different regions, the preliminary task lists were presented to seven to nine SMEs (controllers with varying levels of job
experience) from each of the four options (ARTCC,
TRACON, Tower Cab and AFSS). Special attention
was placed upon having groups of SMEs who were
diverse in race/ethnicity, gender, and years of experience


controlling traffic, and who represented various levels of ATC facilities. Attempts were made to avoid mixing
subordinates and their supervisor(s) in the same meeting.
Project staff instructed SMEs to review their respective task list (whether it be ARTCC, TRACON, Tower
Cab, or AFSS) to confirm which tasks were part of their
job and which were irrelevant. In addition, SMEs were
asked to consolidate redundant tasks, to add important
tasks, and to edit any task statements that needed
rewording for clarification or correction of terminology. SMEs proceeded systematically, reviewing an entire group of tasks under a subactivity, discussing the
necessary changes, and coming to a consensus regarding
those changes before moving to the next subactivity.
After editing of the tasks was completed, SMEs were
asked to identify those tasks that they believed were
performed by all ATCS options, as well as those tasks
specific to one or more ATCS position(s).
These meetings produced four distinct task lists
corresponding to ARTCC, TRACON, Tower Cab,
and AFSS controllers.

Developing SME-Generated WRs Lists
SME meetings were also held for the purpose of having controllers generate their own lists of WRs that they felt were necessary for effective job performance. SMEs were not given the preliminary WR list that had been generated by SACHA job analysts but were instructed to generate their own list of skill and ability-related WRs. SMEs were utilized to identify and define WRs while project staff assisted by clarifying the differences and similarities among the WR definitions. That is, project staff tried to facilitate the development of the WRs without influencing the SMEs' judgments.

As a result of this effort, SME controllers across the three options of ARTCC, Terminal, and Flight Service generated 47 WRs. Based upon the input from the SMEs, definitions were written for each of the WRs. It was concluded that all 47 SME-generated WRs would be applicable to any position within the three ATCS options. Table 2.2 represents the list of 47 SME-generated WRs.

When the SACHA-generated and SME-generated WR lists are compared, they appear to be quite similar except for the lack of metacognitive and information processing WRs in the SME-generated list. SACHA staff reported that controllers generating the SME list of WRs lacked familiarity with metacognitive and information processing constructs and were probably not the best sources of information when it came to identifying these types of WRs. Because (as SACHA researchers stated) no common language existed with which to discuss these types of WRs, project staff did not try to pursue defining these types of WRs with the controller SMEs.

Linking Tasks to SME-Generated WRs
SME meetings were then held to have controllers provide linkage judgments (obtained via group discussion) relating the tasks subsumed under a particular subactivity to the SME-generated WRs required to perform that subactivity. SMEs from the ARTCC and Terminal options reviewed the task and SME-generated WR lists from their respective options and identified those WRs needed to perform each subactivity. SMEs focused upon one subactivity at a time and obtained consensus regarding the most important WRs for that subactivity before moving on to the next. Linkages were made at the subactivity level because the large number of tasks precluded linkages being made at the task level. Due to scheduling problems, the SACHA project staff were unable to hold a linkage meeting with AFSS SMEs, so no data were obtained at this stage linking AFSS tasks to AFSS WRs.

While two SME-generated WRs (Motivation and Commitment to the Job) were not linked to any of the subactivities, controllers stated that these two WRs were related to the overall job. Thus, even though these WRs could not be directly linked to the tasks of any specific subactivity, controllers felt that their importance to overall job performance justified the linkage of these two requirements to every subactivity. Results of the linkage meetings revealed that every SME-generated WR could be linked to at least one subactivity and that each subactivity was linked to at least one WR.

Developing Consolidated List of WRs
At this stage, SACHA job analysts generated a consolidated WR list combining the SME (controller) and SACHA-generated WRs. They began the consolidation process by including 45 of the 47 original SME-generated WRs. Two of the original 47 were dropped (Aviation Science Background and Geography) because they were job knowledges rather than skills or abilities. Next, the SACHA-generated list of WRs was reviewed to add WRs that had not been identified in the SME-generated list but were considered important to the job of ATCS. Finally, the project staff added two WRs (Recall from Interruption and Translation of Uncertainty into Probability) that had not been identified in either the SACHA or the controller lists but were deemed necessary to perform the job from job analytic suggestions.

Thus, a final consolidated list of 66 WRs was created (45 SME-generated and 21 SACHA-generated) that included skills and abilities in the areas of communication, computation, memory, metacognition, reasoning, information processing, attention, perceptual/spatial, interpersonal, self-efficacy, work and effort, and stability/adjustment (Table 2.3).

Job Analysis Survey
Utilizing the information gained from the site visits and SME meetings, the SACHA staff developed a job analysis survey and disseminated it to a cross-section of ATCSs from the various options and assignments located throughout the country. The main goals of the mail-out survey were to identify the most important WRs for predictor development, to explore criterion measures, and to identify possible differences in the subactivities being performed across job assignments. Of the 1009 surveys sent out to ATCSs in February 1994, 444 were returned, with usable data obtained from 389 respondents.

Content of the Survey
The survey was divided into four sections: an introduction, a subactivity ratings section, a WRs rating section, and a background information section. The introduction explained the purpose of the survey to the ATCSs, provided instructions on completing the survey, and encouraged participation. The background section gathered information on such things as the respondents' gender, race, job experience, facility type, and facility level.

The subactivity rating section was comprised of 108 entries, the combined list of all subactivities across the ATCS options of ARTCC, Terminal, and Flight Service. The instructions informed respondents that the survey contained subactivities from ARTCC, TRACON, Tower Cab, and AFSS jobs, and thus it would be unlikely that all entries would be relevant for a particular job assignment. Respondents were asked to rate each subactivity on (a) its importance in successfully performing their job, and (b) the time spent performing this subactivity relative to the other job duties they perform. A single task criticality index was also created by combining the importance and relative time spent ratings. This index provided an indication of the relative criticality of each subactivity with respect to job performance.

The WR rating section of the survey was comprised of 67 items, which included (a) the 45 controller-generated items, (b) the two SME-generated job knowledges that had been reintroduced into the survey, (c) the 16 SACHA-generated items (five of the 21 SACHA-generated items dealing with information processing were left off the survey due to controllers' lack of understanding and familiarity with these constructs), and four items to identify random responses. Respondents were instructed to rate each of the 67 WRs on both its relative importance in learning the job and its relative importance in doing the job.

Overview of Survey Findings
WRs. Findings revealed very little difference between the WRs seen as important for doing the job and those needed to learn the job. Rank orderings of the WR mean scores for doing and learning the job were highly similar. This result appeared to hold across job options and job assignments. Mean rankings of the WRs for all ATCS job assignments are shown in Table 2.4. These scores reflect the mean rankings of the WRs for learning and for doing the job.

The results also suggested that, while there seemed to be no substantial difference between the WR ratings of the ARTCC and the Terminal option controllers (TRACON and Tower Cab), the Flight Service controllers appeared to rate the WRs differently. They rated WRs dealing with information collection and dissemination as relatively more important than did the ARTCC and Terminal option controllers, and rated WRs dealing with metacognitive functions as relatively less important.

As a result of the findings, SACHA staff felt that there were no substantive differences between the ARTCC and the Terminal options in the ordering of the WRs, which would influence predictor development for these two options. However, they advised that any future work dealing with test development for the Flight Service option should take into consideration their different rank ordering of WRs. Tables 2.5 and 2.6 list the mean ratings of the WRs (from each of the four job assignments) for doing and learning the job.

Subactivities. As with the ratings of the WRs, the results of the subactivity ratings revealed that ARTCC
and Terminal option controllers shared similar profiles
with respect to the relative criticality of the subactivities.
While the ARTCC and Terminal option controllers
share more common subactivities than they do each
share with the Flight Service option, 11 subactivities
were given relatively high ratings by all three options.
These common sub-activities were associated with the
safe and expeditious flow of traffic, as well as responding
to emergencies or special conditions and contingencies.
Table 2.7 contains the ranked mean rating of the subactivities across all ATCS options.
The SACHA staff also felt that another important
finding from the controller ratings of the subactivities
was that, regardless of job option or assignment, those
subactivities dealing with multitasking were consistently seen as important to the ATCS job. The project
staff operationalized multitasking as those times when
controllers must (a) perform two or more job tasks
simultaneously, (b) continue job tasks despite frequent
interruptions, and (c) use multiple sensory modalities to
collect information simultaneously or near simultaneously. When dealing with the ratings of the
subactivities across all ATCS options, it was found that
ten of the 11 sub-activities dealing with multitasking
had criticality scores that placed them in the top third of
all subactivities.
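The criticality scores referred to above come from the survey's task criticality index, which combined each subactivity's importance and relative time spent ratings. The exact combination rule is not given in this chapter, so the mean-rating product used in the sketch below is an assumption for illustration only, as are the subactivity names and ratings.

```python
import numpy as np

def criticality_index(importance_ratings, time_spent_ratings):
    """One plausible criticality index for a subactivity: the product of its
    mean importance rating and its mean relative-time-spent rating across
    respondents (an assumed rule; the report only states that the two
    ratings were combined)."""
    return np.mean(importance_ratings) * np.mean(time_spent_ratings)

# Hypothetical ratings from a handful of respondents for two subactivities.
sep_traffic = criticality_index([5, 5, 4, 5], [4, 5, 4, 4])
file_reports = criticality_index([2, 3, 2, 2], [2, 1, 2, 2])
ranked = sorted({"separate traffic": sep_traffic, "file reports": file_reports}.items(),
                key=lambda kv: kv[1], reverse=True)
print(ranked)   # subactivities ordered from most to least critical
```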

Conclusions
Considering the results of the SACHA job analysis survey and taking into account the goals of this selection-oriented job analysis, the project staff arrived at several general conclusions.

There appeared to be no substantial differences in the rankings of the important WRs between ARTCC, TRACON, and Tower Cab controllers. However, the differences in the rankings found between Flight Service option controllers and the other options did appear to be substantive enough that any future efforts to develop selection instrumentation should take these differences into account.

Considerable agreement was found between the subactivity rankings for the ARTCC, TRACON, and Tower Cab controllers, while the rank ordering of the subactivities for the Flight Service option appears to be different from all other options and job assignments.

Regardless of job option or assignment, multitasking is an important component of the ATCS job.

Linkage of Predictors to Work Requirements

Overview
To determine whether the various instruments comprising the AT-SAT predictor battery were indeed measuring the most important WRs found in the SACHA job analysis, linkage judgments were made by individuals familiar with the particular AT-SAT measures, as well as the WRs coming out of SACHA. Linkage analysis data were collected through surveys. Surveys contained (a) a brief introduction describing why the linkage of tests to WRs was necessary, (b) a brief background questionnaire, (c) instructions for completing the linkage survey, (d) definitions of the WRs from SACHA's revised consolidated list, and (e) linkage rating scales for each of the AT-SAT measures. Survey results showed that each of the measures comprising the AT-SAT battery was successfully capturing at least one or more WRs from SACHA's revised consolidated list. Additionally, the vast majority of those WRs being captured by AT-SAT measures were those SACHA found to be most important for both learning and doing the job.

The linkage analysis made use of 65 of the 66 WRs from SACHA's final revised consolidated list. Due to an oversight, two of the WRs from SACHA's list were labeled Rule Application (one SME-generated and the other SACHA-generated), and both were listed under the Information Processing category. When the SACHA list of WRs and their respective definitions were being transcribed for use in the linkage analysis, only one of the two Rule Application WRs was transcribed. Hence, the linkage analysis collected linkage ratings only on the SACHA-generated version of Rule Application, defined as the ability to efficiently apply transformational rules inferred from the complete portions of the stimulus array to the incomplete portion of the array. The SME-generated version of Rule Application, which was defined as the ability to apply learned rules to the real-world work situation, was not included in the linkage survey.

Respondent Background Questionnaire


Project staff created the 7-item background questionnaire to be completed by the survey respondents.
Items measured the respondents' highest educational degree, area of study, experience in data collection, experience in test construction, familiarity with AT-SAT, role in developing AT-SAT predictors and/or criterion measures, and familiarity with the ATCS job. One


purpose of the items was to serve as a check in making sure the individuals were qualified raters. Additionally,
these items could serve to identify subgroups of raters
based upon such things as testing experience, educational background, and/or educational degree. In the
event that rater reliability was low, attempts could be
made to determine whether the lack of rater agreement
was due to any one of these subgrouping variables.

scales (but did not receive the construct labels for these
scales). Respondents were to use the items comprising
each scale to determine the construct being measured by
that particular scale and then make their ratings as to the
degree to which the scale successfully measured each WR.
Definitions of WRs
The survey contained an attachment listing the WRs
and their accompanying definitions from SACHAs
revised consolidated WR list (except for the SMEgenerated WR of Rule Application). It was felt that, in
order for respondents to make the most informed linkage rating between a test and a WR, they should not only
have a clear understanding of the properties of the test,
but also possess a firm grasp of the WR. Survey respondents were instructed to read through the attachment of
WRs and their respective definitions before making any
linkage ratings and to refer back to these definitions
throughout the rating process (Table 2.8).

Descriptions of AT-SAT Measures


The AT-SAT test battery used in the concurrent
validity study contained the following 12 predictor
measures:

Dials
Sound
Letter Factory
Applied Math
Scanning
Angles
Analogies
Memory
Air Traffic Scenarios
Experience Questionnaire
Time Wall/Pattern Recognition
Planes

Survey Respondents
To qualify as raters, individuals had to be familiar
with the measures comprising the AT-SAT battery, and
they had to have an understanding of each of the WRs
being linked to the various measures. Potential respondents were contacted by phone or E-mail and informed
of the nature of the rating task. A pool of 25 potential
respondents was identified. The individuals in this pool
came primarily from the organizations contracted to
perform the AT-SAT validation effort but also included
FAA personnel directly involved with AT-SAT.

An important element of the linkage survey consisted


of operational descriptions of each of the AT-SAT
predictor tests. These descriptions were meant not to
replace the respondents' familiarity and experience with
each of the measures but to highlight the most basic
features of each measure for each respondent. While one
of the criteria for inclusion as a survey respondent was
a familiarity with each of the measures (typically gained
through helping to create and/or taking the test), it was
felt that a general description of the features of each
measure would facilitate a more complete recall of the
test characteristics. Respondents were instructed to read
each test description before rating the degree to which
that test measures each of the WRs. Figure 2.1 is an
example of a description for one of the AT-SAT measures and its accompanying rating scale.
The only predictor measure for which no operational
description was provided was the Experience Questionnaire (EQ). This measure was a biodata inventory
comprised of 14 subscales, with individual subscales
containing anywhere from 9 to 15 items. In the place of
test descriptions, respondents making linkage ratings
on the EQ received the actual items from the individual

Survey Methodology
Those who had agreed to participate in the linkage
process received the packet of rating materials via regular mail. Each packet contained the following items:
(1) An introduction, which outlined the importance of
linking the AT-SAT predictor tests to the WRs identified in the SACHA job analysis. It included the names
and phone numbers of project staff who could be
contacted if respondents had questions concerning the
rating process.
(2) The 7-item background questionnaire.
(3) The attachment containing the list of WRs and
their definitions.


(4) Rating scales for each of the AT-SAT tests. Each


rating scale contained the operational description of the
measure (or for the EQ, those items comprising an EQ
sub-scale), the Likert scale response options, and the
WRs to be rated (Figure 2.2.).

Scale Reliability
Reliability indices were computed for each rating
scale. Scale reliabilities ranged from .86 to .96. Hence,
the intraclass correlations (Shrout & Fleiss, 1979) for
each of the rating scales revealed a high level of agreement between the respondents as to which WRs were
being successfully measured by the respective tests.
These reliability coefficients are listed in Table 2.9. In
view of the high level of agreement, it appeared that such
factors as the raters' highest educational degree, educational
background, and familiarity with the ATCS job did not
influence the level of agreement among the raters.
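For illustration only: an intraclass correlation of the kind cited above can be computed from a WRs-by-raters matrix of linkage ratings. The sketch below assumes the ICC(2,k) form from Shrout and Fleiss (1979) and a small hypothetical rating matrix; the chapter does not reproduce the raw ratings or state which ICC form was applied.

    import numpy as np

    def icc_2k(ratings):
        # ICC(2,k): two-way random effects, reliability of the average of k raters
        # ratings: n_targets (WRs) x k_raters matrix of 0-5 linkage ratings
        ratings = np.asarray(ratings, dtype=float)
        n, k = ratings.shape
        grand = ratings.mean()
        row_means = ratings.mean(axis=1)   # mean rating per WR
        col_means = ratings.mean(axis=0)   # mean rating per rater
        ss_rows = k * np.sum((row_means - grand) ** 2)
        ss_cols = n * np.sum((col_means - grand) ** 2)
        ss_err = np.sum((ratings - grand) ** 2) - ss_rows - ss_cols
        bms = ss_rows / (n - 1)                 # between-WRs mean square
        jms = ss_cols / (k - 1)                 # between-raters mean square
        ems = ss_err / ((n - 1) * (k - 1))      # residual mean square
        return (bms - ems) / (bms + (jms - ems) / n)

    # Hypothetical example: 5 WRs rated by 4 raters on the 0-5 linkage scale
    example = [[5, 4, 5, 4], [3, 3, 4, 3], [1, 0, 1, 1], [4, 5, 4, 4], [2, 2, 3, 2]]
    print(round(icc_2k(example), 2))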

In view of the inordinate amount of time it would


take a respondent to rate all the WRs on each of the 12
tests, tests were divided in half, with one group of
respondents rating six tests and the other group rating
the other six tests. The 25 potential raters who had been
identified were split into two groups. Thirteen respondents were responsible for linkage ratings for the Angles,
Analogies, Memory, AT Scenarios, Planes, and Experiences Questionnaire; the remaining 12 were to make
linkage ratings for the Dials, Sound, Letter Factory,
Applied Math, Scanning, and Time Wall. The subset of
tests sent to the 13 respondents was labeled Version 1,
and the second subset of tests sent to the remaining 12
respondents was labeled Version 2.

Angles
The Angles test measures the participant's ability to
recognize angles. This test contains 30 multiple-choice
questions and allows participants up to 8 minutes to
complete them. The score is based on the number of
correct answers (with no penalty for wrong or unanswered questions). There are two types of questions on
the test. The first presents a picture of an angle and the
participant chooses the correct measure of the angle (in
degrees) from among four response options. The second
presents a measure in degrees and the participant chooses
the angle (among four response options) that represents
that measure. For each worker requirement listed below, enter the rating best describing the extent to which
this test and/or its subtests measure that particular
worker requirement.

Results of the Survey


The surveys were returned, and the data were analyzed by project staff. Twenty-four respondents completed the background questionnaire, as well as all or
portions of the six tests they were to link with the
respective WRs. Nineteen of the 24 respondents classified themselves as Industrial/Organizational Psychologists; all but one of the 24 had obtained at least a master's
degree. In general, results of the questionnaire indicated
that the raters were experienced in test construction and
administration and were familiar with the AT-SAT test
battery.
All 12 respondents who were asked to rate Version 1
of the linkage survey completed and returned their
ratings. Two of these individuals volunteered to complete the linkage ratings of the Version 2 tests and
followed through by completing these ratings as well.
Completed ratings were returned by 11 of the 13
respondents who were tasked with making linkage
ratings for Version 2 tests, with one rater choosing not
to rate either the Memory or the Planes tests. Hence, 12
complete sets of linkage ratings were obtained for the
Version 1 tests, and 14 complete sets of linkage ratings
were obtained for all but two of the Version 2 tests
(Table 2.9).

5= This test measures this worker requirement to a
very great extent
4= This test measures this worker requirement to a
considerable extent
3= This test measures this worker requirement to a
moderate extent
2= This test measures this worker requirement to a
limited extent
1= This test measures this worker requirement to a
slight extent
0= This test does not measure this worker requirement
Linkage Results
Mean linkage scores between tests and WRs were
computed, representing a summary of the extent to
which the raters felt each test measured each WR. It was
decided that linkage means greater than or equal to 3
suggested that the raters felt a test was able to measure
the WR to at least a moderate extent. Therefore, a
criterion cutoff mean ≥ 3 was established to determine
if a test was indeed able to successfully measure a
particular WR. Table 2.10 presents the following information: (a) a full list of the WRs (rank ordered by their
importance for doing the job), (b) those tests rated as
measuring the WR to at least a moderate extent (based
upon the mean ≥ 3 cutoff), and (c) the mean linkage
rating corresponding to that test/WR pair.
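As a purely illustrative restatement of the cutoff rule (the tests, WRs, and ratings below are hypothetical, not values from Table 2.10): a test/WR pair is treated as successfully measured when the mean of the raters' 0-5 linkage ratings is at least 3.

    from statistics import mean

    # Hypothetical raters' 0-5 linkage ratings for two test/WR pairs
    ratings = {
        ("Letter Factory", "Prioritization"): [4, 3, 5, 4, 3],
        ("Dials", "Oral Communication"): [0, 1, 0, 0, 1],
    }

    CUTOFF = 3.0  # "measures the WR to at least a moderate extent"

    for (test, wr), scores in ratings.items():
        m = mean(scores)
        verdict = "measured" if m >= CUTOFF else "not measured"
        print(f"{test} / {wr}: mean = {m:.2f} -> {verdict}")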
The rank-ordered listing of WRs used in Table 2.10
was derived from the SACHA job analysis and consisted
of the ARTCC controllers' ratings of the extent to
which the WR was seen as being important for doing the
job. Hence, on the basis of ARTCC controllers' ratings
from SACHA, prioritization was seen as the most
important WR for doing the job. AT-SAT project staff
chose to use the ARTCC rank-ordered listing of WRs
on their importance for doing the job for three reasons.
First, the ARTCC list was chosen over the list generated
by the ratings from all ATCSs because the latter included the ratings of controllers from the Flight Service
option. SACHA had clearly stated that the job analysis
findings for the Flight Service option were different
enough from the ARTCC and Terminal options to
require its own predictor development. Second, the
ARTCC list was selected over the Terminal option list
because the AT-SAT validation effort was using controllers from the ARTCC option. Finally, the ARTCC
list for doing the job was chosen over the list for learning
the job because the focus of AT-SAT was on developing
a selection battery that would predict performance in
the job, not necessarily in training.
It should be mentioned that no data on importance
for learning or doing the job existed for five of the
SACHA-generated information processing WRs (Confirmation, Encoding, Rule Inference, Rule Application,
and Learning). This was because the SACHA project
staff believed that controllers could not adequately
comprehend these WRs well enough to rate them.
Because of this lack of importance data, project staff placed
these WRs at the end of the list of WRs (see Table 2.10.).
Based upon these criteria for inclusion, it was found
that 14 of the 15 most important WRs (as rated by
ARTCC controllers in the SACHA job analysis) were
successfully measured by one or more of the tests of the
AT-SAT battery. Similarly, the mean linkage ratings
suggest that the vast majority of the more important
WRs were successfully measured by multiple tests.

The linkage survey results indicated that not all important WRs were successfully measured by the AT-SAT battery. Four WRs (Oral Communication, Problem
Solving, Long-Term Memory, and Visualization) from
the top third of SACHA's rank-ordered list did not have
linkage means high enough to suggest that they were
being measured to at least a moderate extent. None of
the AT-SAT tests were specifically designed to measure
oral communication and, as a result, linkage means
between this WR and the tests were found to be at or
near zero. Problem Solving had mean linkage ratings
that approached our criterion for inclusion for the
Applied Math and the Letter Factory tests. Similarly, the
mean linkage ratings between the Memory test and
Long-Term Memory, and between the Letter Factory
test and Visualization also approached but failed to
meet the mean criterion score of 3.
Quality of Individual Tests in the AT-SAT Battery
Results of the linkage survey were also summarized to
enable project staff to gain insight into how well individual tests were measuring the most important WRs.
Based upon the criterion of mean linkage score ≥ 3 for
demonstrating that a test successfully measures a particular WR, project staff determined the number of
WRs successfully measured by each test. This score
provided some indication of the utility of each test.
Project staff also computed two additional scores to
indicate the utility of each measure. Some WRs were
rated as being successfully measured by many tests, and
other WRs were measured by only one or two tests. Two
other indicators of the utility of a measure were developed: (a) the number of WRs a test measured that are
only measured by one (or fewer) other test(s), and (b) the
number of WRs that are not measured by any other test.
Scores based upon these criteria were computed for each
measure and are listed in Table 2.11.
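For illustration (with hypothetical pairs rather than the actual Table 2.11 entries), the three utility counts described above can be derived from the set of test/WR pairs that met the cutoff: the number of WRs a test measures, the number it measures that at most one other test also measures, and the number it measures uniquely.

    # Hypothetical (test, WR) pairs whose mean linkage rating met the >= 3 cutoff
    measured = {
        ("Letter Factory", "Prioritization"), ("AT Scenarios", "Prioritization"),
        ("Time Wall", "Prioritization"), ("Applied Math", "Computation"),
        ("Analogies", "Reasoning"),
    }

    wr_coverage = {}                            # WR -> number of tests measuring it
    for _, wr in measured:
        wr_coverage[wr] = wr_coverage.get(wr, 0) + 1

    for test in sorted({t for t, _ in measured}):
        wrs = [wr for t, wr in measured if t == test]
        n_total = len(wrs)                                            # WRs measured
        n_near_unique = sum(1 for wr in wrs if wr_coverage[wr] <= 2)  # one or fewer other tests
        n_unique = sum(1 for wr in wrs if wr_coverage[wr] == 1)       # no other test
        print(test, n_total, n_near_unique, n_unique)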
In addition to the indicators of each test's utility, it
was felt that indicators of each test's utility and quality
in measuring WRs could also be computed. To provide
some indication of each test's quality, project staff again
utilized SACHA findings: the ARTCC controller
ratings of the importance of each WR for doing the job.
Each WR's mean importance rating (from SACHA)
was multiplied by those WR/test linkage ratings meeting criteria. The product of these two scores (mean WR
importance for doing the job × mean linkage rating of
WR for a test) factored in not only how well the test was
capturing the WR but the importance of that WR as
well. The mean and sum of these products were computed for each (Table 2.11). The mean of the products
can be viewed as an indicator of the average quality of a
measure factoring in both how well the test was measuring the WR and the importance of the WR. The sum of
the products provides some indication of the overall
utility of the measure in that the more WRs a test
captures, the better it captures those WRs, and the more
important these WRs are for doing the job, the
higher a tests total score on this factor.
Given that no data were collected in SACHA for five
of the WRs (Confirmation, Encoding, Rule Inference,
Rule Application, and Learning) on their importance
for doing the job, the mean importance score across all
the WRs was imputed for these five WRs. This was done
so that some indication of a tests ability to measure
these WRs could be computed and factored into its
overall quality and utility scores (Table 2.11.).
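A minimal sketch of the quality and utility scores just described, using hypothetical numbers rather than the actual SACHA or linkage values: each qualifying linkage rating is multiplied by the WR's mean importance rating, the grand mean importance is imputed for WRs that lack importance data, and the mean and sum of the products are reported for the test.

    from statistics import mean

    # Hypothetical SACHA importance ratings (None = no importance data collected)
    importance = {"Prioritization": 4.6, "Computation": 3.9, "Rule Application": None}

    # Hypothetical qualifying (mean >= 3) linkage ratings for one test
    linkage_for_test = {"Prioritization": 4.2, "Computation": 3.1, "Rule Application": 3.4}

    # Impute the mean importance across rated WRs for WRs with no importance data
    rated = [v for v in importance.values() if v is not None]
    imputed = {wr: (v if v is not None else mean(rated)) for wr, v in importance.items()}

    products = [imputed[wr] * link for wr, link in linkage_for_test.items()]
    print("mean of products (quality):", round(mean(products), 2))
    print("sum of products (utility):", round(sum(products), 2))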
Results suggest that some tests - Letter Factory, AT
Scenarios, and to a lesser degree the Time Wall and
Analogies tests - measured numerous WRs, while the
remaining tests measured from one to three WRs. Some

tests, such as Applied Math and Analogies, measured


multiple WRs that were not measured by other tests,
while other tests (Letter Factory, Air Traffic Scenarios,
and Time Wall) measured many WRs but none uniquely.
It should be mentioned that one of the reasons the Letter
Factory, Air Traffic Scenarios, and Time Wall did not
uniquely capture any WRs was that there was so much
overlap in the WRs successfully measured by these three
tests, especially between the Letter Factory and the Air
Traffic Scenarios.

CONCLUSION
Based upon the results of the linkage survey, every
test within the AT-SAT battery appeared to be successfully measuring at least one WR, and many of the tests
were rated as measuring multiple WRs. While not every
WR was thought to be successfully measured by the AT-SAT battery, the vast majority of the WRs considered
most important for doing the job were successfully
measured by one or more predictors from the battery.


CHAPTER 3.1
PREDICTOR DEVELOPMENT BACKGROUND
Douglas Quartetti, HumRRO
William Kieckhaefer, RGI, Inc.
Janis Houston, PDRI, Inc.
The final test in the OPM Battery, the OKT, contained questions on air traffic phraseology and procedures. It was designed to provide credit for prior ATCS
experience. It has been reported that OKT scores correlated with many of the indices of training success (Boone,
1979; Buckley, O'Connor, & Beebe, 1970; Manning et
al., 1989; Mies, Coleman, & Domenech, 1977).
The scores on the MCAT and the ABSR were combined with weights of .80 and .20 applied, respectively.
These scores were then transmuted to have a mean of 70
and maximum of 100. The passing score varied with
education and prior experience. Applicants who received
passing scores on the first two predictors could receive up
to 15 additional points from the OKT.
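A hedged sketch of this scoring scheme follows; the exact rescaling is not spelled out in the text, so the linear "transmutation" below, which maps the applicant pool's mean composite to 70 and its maximum to 100, is only one plausible reading, and every number shown is hypothetical.

    from statistics import mean

    def opm_composites(mcat, absr):
        # Combine MCAT and ABSR with .80/.20 weights, then rescale so the pool
        # mean is 70 and the pool maximum is 100 (assumed linear mapping).
        raw = [0.80 * m + 0.20 * a for m, a in zip(mcat, absr)]
        mu, top = mean(raw), max(raw)
        return [70 + (x - mu) * (100 - 70) / (top - mu) for x in raw]

    # Hypothetical applicants; up to 15 OKT points are added for those who pass.
    # The precise rule for combining the OKT points is not given in the text.
    mcat = [55, 70, 85]
    absr = [60, 65, 90]
    okt_points = [5, 0, 15]
    final = [s + b for s, b in zip(opm_composites(mcat, absr), okt_points)]
    print([round(s, 1) for s in final])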
The second stage in the hiring process was the Academy Screen. Applicants who passed the OPM Battery
were sent to the FAA Academy for a 9-week screen,
which involved both selection and training (Manning,
1991a). Students spent the first 5 weeks learning aviation
and air traffic control concepts and the final 4 weeks
being tested on their ability to apply ATC principles in
non-radar simulation problems. Applicants could still be
denied positions after the 9 weeks on the basis of their
scores during this phase. The reported failure rate was 40
percent (Cooper et al., 1994).
This hiring process received much criticism, despite
its reported effectiveness and links to job performance.
The criticisms revolved around the time (9 weeks for the
Academy screen) and cost of such a screening device
($10,000 per applicant). In addition to the FAA investment, applicants made a substantial investment, and the
possibility remained that after the 9 weeks an applicant
could be denied a position. Finally, there was concern
that the combination of screening and training reduced
training effectiveness and made it impossible to tailor
training needs to individual students.
As a result of these criticisms, the FAA separated
selection and training, with the idea that the training
atmosphere of the Academy Screen would be more
supportive and oriented toward development of ATCSs

Following the air traffic controller strike and the


subsequent firing of a significant portion of that workforce
in 1981, the Federal Aviation Administration was forced
to hire en masse to ensure safety of the airways. Cooper,
Blair, and Schemmer (1994) reported on the selection
procedures used after the strike. Their work is summarized below.

SELECTION PROCEDURES PRIOR TO AT-SAT
The OPM Battery
In October 1981, the FAA introduced a two-stage
process for selecting Air Traffic Control Specialists
(ATCSs). The first stage was a paper-and-pencil aptitude
test battery administered by the Office of Personnel
Management (OPM), known as the OPM Battery. This
battery consisted of three tests: the Multiplex Controller
Aptitude Test (MCAT), the Abstract Reasoning Test
(ABSR), and the Occupational Knowledge Test (OKT).
The second stage was called the Academy Screen.
The first test of the OPM Battery, the MCAT, simulated aspects of air traffic control. Applicants were required to solve time, distance, and speed problems, plus
interpret tabular and graphical information to identify
potential conflicts between aircraft. Extensive research at
the Civil Aeromedical Institute (CAMI) indicated that
the MCAT scores were significantly correlated with
performance during the Academy Screen and later field
status (Manning, Della Rocco, & Bryant, 1989; Rock,
Dailey, Ozur, Boone, & Pickerel, 1978).
The second test of the OPM Battery, the ABSR, was
developed by the U.S. Civil Service Commission to
examine the abstract relationships between symbols and
letters. Research indicated a relationship between scores
on this test and the Academy Screen training performance (Boone, 1979; Rock et al., 1978).


once it was separated from selection. This necessitated


developing a new selection device to replace the Academy Screen.

Time Wall; Mean Correct Reaction Time from Pattern


Recognition; Stroop Mean Reaction Time for Conflict
Stimuli from the Stroop Color-Word Test; and Visual
Search Mean Correct Reaction Time from Visual
Search). These scores were retained based on their
single-order correlation with the criterion, their
intercorrelations with other predictor scores, and the
multiprocessing nature of the paired test scores (e.g., Air
Traffic Safety and Delay).
Multiple regression analyses showed that the Safety
score from Air Traffic Scenario and the Percent Correct
and Correct Reaction Time scores from the Static
Vector test had significant power in predicting the
Academy Screen Comprehensive Test score. The betas
for the remaining subtest scores were not significant in
the context of the other tests. ASI (1991) reported the
regression model shown in Tables 3.1.1 and 3.1.2.
The Pre-Training Screen was intended to be used as
a secondary screening procedure. Incremental validity
was estimated for the OPM battery score, and for the
OPM and PTS scores where the OPM score was entered
in step 1 and the PTS scores were entered in a block at
step 2. The OPM score alone produced an R = .226, R² = .05. The model using the OPM and PTS scores
produced a multiple correlation of R = .505, R² = .26. The difference in variance accounted for by the
addition of the PTS (.26 vs. .05) was significant (F = 24.18, p < .01). This indicated that the Pre-Training
Screen added significantly to the prediction of the
Academy Screen Comprehensive Test score, over and
above the OPM battery alone.
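For illustration only, with simulated rather than actual applicant data: the hierarchical step described above can be reproduced by regressing the criterion on the OPM score alone, adding the PTS scores as a block, and testing the change in R² with an incremental F test.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200                                   # simulated applicants
    opm = rng.normal(size=(n, 1))             # OPM battery score
    pts = rng.normal(size=(n, 3))             # block of PTS subtest scores
    y = 0.2 * opm[:, 0] + 0.4 * pts[:, 0] + rng.normal(size=n)   # criterion

    def r_squared(X, y):
        X1 = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
        resid = y - X1 @ beta
        return 1 - resid.var() / y.var()

    r2_step1 = r_squared(opm, y)                          # OPM alone
    r2_step2 = r_squared(np.column_stack([opm, pts]), y)  # OPM plus PTS block
    q, p_full = pts.shape[1], 1 + pts.shape[1]
    F = ((r2_step2 - r2_step1) / q) / ((1 - r2_step2) / (n - p_full - 1))
    print(round(r2_step1, 3), round(r2_step2, 3), round(F, 2))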
The second validation attempt (Weltin, Broach,
Goldbach, & O'Donnell, 1992) was a concurrent criterion-related study using a composite measure of on-the-job training performance. Scores obtained from the Air
Traffic Scenario Test, Static Vector, Continuous
Memory, Stroop Color-Word Test, and Letter Rotation Test correlated significantly with the criterion.
Using two weighting schemes, the regression-based
weighting scheme yielded a correlation of .21, whereas
the unit weighting yielded a correlation of .18.
Use of the PTS as a screening device was discontinued in February 1994. The defensibility of the PTS was
questioned since it was validated only against training
performance criteria. The perception was that the test
security of the OPM test, in use since 1981 without
revision, had been compromised. Further, several coaching schools provided guarantees to students that they

The Pre-Training Screen


The FAA introduced the Pre-Training Screen (PTS)
in June 1992 to replace the second stage of the hiring
process, the Academy Screen. The PTS was developed
from a cognitive psychology perspective by Aerospace
Sciences, Inc. (ASI) in 1991. It was computer administered and consisted of two parts: the Complex Cognitive
Battery and the Air Traffic Scenario Test. For complete
descriptions of the components of the PTS and the
critical aptitudes covered by these tests, the reader is
referred to ASI (1991).
The first part of the PTS, the Complex Cognitive
Battery, included five test components: Static Vector/
Continuous Memory, Time Wall/Pattern Recognition,
Visual Search, Stroop Color-Word Test, and Letter
Rotation Test. According to ASI (1991), the Static
Vector/Continuous Memory Test was a multimeasure
test designed to assess the critical aptitudes of spatial
relations, working memory, verbal/numerical coding,
attention switching, and visualization. The Time Wall/
Pattern Recognition test was designed to assess filtering,
movement detection, prioritizing, short-term memory,
image/pattern recognition, and spatial scanning. The
Visual Search test measured short-term memory and
perceptual speed. The Stroop Color-Word Test assessed
the critical aptitudes of decoding, filtering, and short-term memory. Finally, the Letter Rotation Test assessed
the critical aptitudes of decoding, image/pattern recognition, and visualization. It should be noted that each
test in this battery could yield multiple scores.
The second part of the PTS, the Air Traffic Scenario
Test, was a low-fidelity work sample test (Broach &
Brecht-Clark, 1994). Applicants were given a synthetic,
simplified air space to control. This test was designed to
assess nearly all of the critical aptitudes of the ATCS job.
Two attempts were made to validate the PTS. The
first (ASI, 1991) correlated PTS performance with
training criteria (the Academy Screen Comprehensive
Test score). Based on correlation analyses, the full set of
test scores was reduced to ten (Safety and Delay scores
from Air Traffic Scenario; Percent Correct and Mean
Correct Reaction Time from Static Vector; Percent
Correct and Mean Correct Reaction Time from Continuous Memory; Mean Absolute Time Error from


would pass the OPM battery. For a more complete


discussion of prior programs used to screen ATCS
candidates before AT-SAT, see Chapter 6 of this report.

Table 3.1.5 displays the construct categories, worker


requirements under the higher order construct of Processing Operations, and the tests that Schemmer et al.
hypothesized would assess the worker requirements.
Table 3.1.5 reveals that, for the construct labeled
"Metacognitive," no tests had been recommended. In
addition, Schemmer et al. did not account for Sustained
Attention, Timesharing, Scanning, or Movement Detection worker requirements.
Finally, a Temperament/Interpersonal Model was
proposed to provide coverage of the worker requirements that did not fit into the Cognitive Model (Table
3.1.6.).
As noted earlier, due to a compromised OPM battery
and the elimination of use of PTS, the FAA decided to
support the development and validation of a new test
battery against job performance criteria. With this decision, a contract was awarded to Caliber Associates, and
support for the AT-SAT project was initiated.

Separation and Control Hiring Assessment (SACHA)
In September 1991, the FAA awarded a contract to
University Research Corporation for the development
and validation of a new test battery for selection of
ATCSs. (The outcomes of the SACHA job analysis were
covered in more detail in Chapter 2.) By 1996, a
comprehensive job analysis was completed on four
ATCS options, and the construction of possible predictor tests had begun. The FAA terminated the SACHA
contract late in 1996.
A meta-analytic study of much of the previous validation research on the ATCS job was performed as part
of SACHA (Schemmer et al., 1996). This study reported
on predictors ranging from traditional cognitive ability
tests and personal characteristics instruments to air traffic
control simulations and psychomotor ability measures.
The validity studies are summarized in Table 3.1.3.
As reported by Schemmer et al. (1996), for most of
the predictor measure categories, the validity coefficients exhibited substantially greater variability than
would be expected under the simple explanation of
sampling error. This suggests that, in general, some
specific predictor measures are relatively more predictive of job performance than others. For example,
simulations and math tests have historically been good
predictors of controller job performance.
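For illustration (this is not the analysis Schemmer et al. actually report), one conventional way to check whether validity coefficients vary more than sampling error alone would predict is a bare-bones comparison of the observed variance of the study correlations with the sampling-error variance expected from the mean correlation and mean sample size; the study values below are hypothetical.

    from statistics import mean

    # Hypothetical validity coefficients and sample sizes from a set of studies
    r = [0.15, 0.32, 0.05, 0.41, 0.22]
    n = [120, 250, 90, 300, 150]

    r_bar = sum(ri * ni for ri, ni in zip(r, n)) / sum(n)        # weighted mean r
    observed_var = sum(ni * (ri - r_bar) ** 2 for ri, ni in zip(r, n)) / sum(n)
    expected_var = (1 - r_bar ** 2) ** 2 / (mean(n) - 1)         # expected sampling-error variance

    print(round(observed_var, 4), round(expected_var, 4))
    print("variance beyond sampling error:", round(observed_var - expected_var, 4))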
On the basis of the SACHA job analysis, Schemmer
et al. (1996) proposed an overall model of the ATCS
worker requirements that included a Cognitive Model
and a Temperament/Interpersonal Model. The Cognitive Model contained two higher-order constructs, g
and Processing Operations. Table 3.1.4 displays the
construct categories, worker requirements under the
higher order construct of g, and the tests purported to
measure the worker requirements. Schemmer et al.
recommended at least one test per worker requirement.
As Table 3.1.4 shows, there were some worker requirements for which the project still had not developed tests.
For example, their predictor battery did not account for
any of the requirements under the rubric of "Communication." Additionally, much of the Applied Reasoning
construct remained untested, and Numeric Ability
(Multiplication/Division), Scanning, and Movement
Detection were not addressed.

AIR TRAFFIC SELECTION AND TRAINING (AT-SAT) PROJECT
One of the challenges facing the AT-SAT research
team was to decide what SACHA-generated materials
would be adequate for the new battery and how many
new tests needed to be developed. This section describes
the procedures undertaken to review existing SACHA
materials and documents the evaluation of and comments on the battery's coverage of the predictor space.
Recommendations for the AT-SAT project were made,
based on the review process.
Test by Test Evaluation
A panel of nine individuals was asked to review each
test currently available on computer for possible inclusion in the air traffic control predictor test battery.
Evaluation sheets were provided for each test, requesting information about the following criteria:
(1) Does the test measure the worker requirement(s) it
purports to measure?
(2) Is it a "tried-and-true" method of assessing the worker
requirement(s)?
(3) Does the scoring process support the measurement
of the worker requirement(s)?
(4) Is the time allocation/emphasis appropriate?
(5) Is the reading level consistent with job requirements?


(6) Does the test have potential adverse impact?


(7) Is the test construction ready for validation administration?
Short descriptions of each test were provided for this
evaluation, along with test information such as number
of items, and scoring procedures. The worker requirement definitions used throughout this evaluation process were those listed for the Revised Consolidated
Worker Requirements on pages 115-119 of the SACHA
Final Job Analysis Report (January 1995). Sixteen tests
were independently reviewed by the panel members.
The results of the independent reviews were discussed at
a 3-day meeting. An additional four tests (Letter Factory
and the three PTS tests) were reviewed in a similar
fashion during the meeting. The 20 tests reviewed were:
Sound
Scan
Angles
Map
Dial Reading
Headings
Projection
Memory 1 and 2
Direction and Distance
Planes

ent subsets of tests covered the predictor domain. The


list of worker requirements, rank ordered by incumbent
importance ratings, as described in Chapter 2, was used
to help determine whether different tests or subsets of
tests covered the critical job requirements. These investigations and recommendations are summarized below.
Nine tests that received a preponderance of "Yes"
ratings were measuring critically important job requirements and appeared to be relatively non-overlapping.
These were Scan, Letter Factory, Sound, Dial Reading,
PEAQ, Analogy, Air Traffic Scenario (ATS), Time
Wall/Pattern Recognition (TW), and Static Vector/
Continuous Memory (SV). These nine tests were recommended for inclusion in the predictor battery. All
required modifications before they were deemed ready
for administration. Examples of the recommended
modifications follow.

Stix
Time
Syllogism
Analogy
Classification
Personal Experiences and Attitude
Questionnaire (PEAQ)
Letter Factory
Air Traffic Scenario (from PTS)
Time Wall/Pattern Recognition (from PTS)
Static Vector/Continuous Memory (from PTS)

Scan: Increase clarity of figures, increase number of


test items, and possibly use mouse to decrease keyboard
skills requirement.
Letter Factory: Increase planning/thinking ahead requirement (e.g., by adding boxes at top of columns).
Sound: Investigate possibility of changing scoring to
allow partial credit for partially correct answers.
Dial Reading: Increase number of items, decrease
time limit, investigate fineness of differentiation required.
PEAQ: Decrease number of items (focus on only
critically important worker requirements), replace random response items, edit response options for all items.
Analogy: Delete information processing component,
possibly add some of the Classification test items.
ATS, TW, SV: Separate individual tests from PTS
administration, shorten tests.

The project staff met with the panel members on 5-7 November 1996 to discuss the predictor battery. For
each test, independent ratings on each evaluation criterion were collected, and the relative merits and problems of including that test in the predictor battery were
discussed. The comments were summarized and recorded.
After the group discussion, panel members were
asked to provide independent evaluations on whether or
not each test should be included in the predictor battery.
For each test, panel members indicated "Yes" for inclusion, "No" for exclusion, and "Maybe" for possible
inclusion. The Yes-No-Maybe ratings were tallied and
summarized.

Three additional tests were strongly considered for


inclusion, but with further modifications: Planes, Projection (perhaps modified to represent above-ground
stimuli), and a numerical ability test. The latter represented a worker requirement that was otherwise not
measured by the set of included tests. The plan for the
numerical ability test was initially to include items
modified from several existing tests: Headings, Direction and Distance, and Time, all of which include
components of on-the-job numerical computation.
Angles and Dials would be added to round out the
numeric ability construct.

Selection of a Subset of Tests


The next step involved selecting a subset of the 20
tests for inclusion in the predictor battery. Considerations included both the Yes-No-Maybe ratings (based
on multiple, test-specific criteria), and how well differ-


AT-SAT ALPHA BATTERY

Possible Gaps in Coverage


Viewing the 12 tests on the preliminary list as a
whole, the coverage of worker requirements appeared
quite good. However, a few important worker requirements remained unrepresented: reading comprehension, memory, and street physics. There was some
discussion about including measures of these three
requirements. A reading test could be prepared, using
very short, face valid passages, where the passage and the
test question could be displayed on screen at the same
time. Discussions about adding a memory test primarily
focused on a modification of the Map test, which would
require candidates to indicate whether the stimulus had
"changed" or "remained the same" since they memorized
it. The possibility of finding a published test measuring
"street physics" was also discussed. If such a published test
could not be found, some kind of mechanical or abstract
reasoning test might be included as a close approximation.

Based on the reviews and recommendations of the


expert panel, the AT-SAT researchers developed the
predictor battery to be pilot tested, called the Alpha
Battery. It consisted of 14 tests given across five testing
blocks. They were:
Block A: Air Traffic Scenarios
Block B: Sound test and Letter Factory test
Block C: Dials test, Static Vector/Continuous Memory test, and Experiences Questionnaire (formerly PEAQ)
Block D: Time Wall/Pattern Recognition test, Analogy test, and Classification test
Block E: Word Memory test, Scan test, Planes test, Angles test, and Applied Mathematics test
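As a purely illustrative restatement of the layout above (a sketch, not part of the original battery software), the five-block structure can be captured in a simple mapping:

    ALPHA_BATTERY_BLOCKS = {
        "A": ["Air Traffic Scenarios"],
        "B": ["Sound", "Letter Factory"],
        "C": ["Dials", "Static Vector/Continuous Memory", "Experiences Questionnaire"],
        "D": ["Time Wall/Pattern Recognition", "Analogy", "Classification"],
        "E": ["Word Memory", "Scan", "Planes", "Angles", "Applied Mathematics"],
    }

    # 14 tests across five blocks, as stated above
    assert sum(len(tests) for tests in ALPHA_BATTERY_BLOCKS.values()) == 14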

A short description of the tests follows. In a few


instances, details reflect modifications made in the
alpha pilot tests for use in the beta (validation) testing.

Excluded Tests
Three of the 20 tests reviewed were deleted from
further consideration: Stix, Map (except as it might be
revised to cover memory), and Syllogism. These tests
were deleted because of problems with test construction, and/or questionable relevance for important job
requirements, and/or redundancy with the included
measures.

Air Traffic Scenarios Test


This is a low-fidelity simulation of an air traffic
control radar screen that is updated every 7 seconds. The
goal is to maintain separation and control of varying
numbers of simulated aircraft (represented as data blocks)
within the participant's designated airspace as efficiently as possible. Simulated aircraft either pass through
the airspace or land at one of two airports within the
airspace. Each aircraft indicates its present heading,
speed, and altitude via its data block. There are eight
different headings representing 45-degree increments,
three different speed levels (slow, moderate, fast), and
four different altitude levels (1=lowest and 4=highest).
Separation and control are achieved by communicating and coordinating with each aircraft. This is accomplished by using the computer mouse to click on the
data block representing each aircraft and providing
instructions such as heading, speed, or altitude. New
aircraft in the participant's airspace have data blocks
appear in white that turn green once the participant has
communicated with them. Rules for handling aircraft
are as follows: (1) maintain a designated separation
distance between planes, (2) land designated aircraft at
their proper airport and in the proper landing direction
flying at the lowest altitude and lowest speed, (3) route
aircraft passing through the airspace to their designated
exit at the highest altitude and highest speed. The

Additional Recommendations
Several additional recommendations were made concerning the predictor battery and its documentation.
The first was that all tests, once revised, be carefully
reviewed to ensure that the battery adheres to good test
construction principles such as consistency of directions and keyboard use, reading/vocabulary level, and
balancing keyed response options.
A second recommendation was that linkages be provided for worker requirements that do not currently
have documented linkages with ATCS job duties. The
current documentation (from the Job Analysis report)
was incomplete in this regard.
A third recommendation was to pilot test the
predictor set in February l997. It was thought that this
would yield the kind of data needed to perform a final
revision of all predictors, select the best test items,
shorten tests, reduce redundancy across tests, ensure
clarity of instructions, and so on.


version of ATST that was incorporated in the alpha


battery was modified to operate in the Windows environment (Broach, 1996).

A plane has flown for 3 hours with a ground speed of


210 knots. How far did the plane travel?

These questions require the participant be able to


factor in such things as time and distance to identify the
correct answer from among the four answer choices.
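For the sample item above, and assuming the usual relation distance = ground speed × time, the expected answer would be 210 knots × 3 hours = 630 nautical miles; the answer is offered here only as a worked example and does not appear in the item text.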

Analogy Test
The Analogy test measures the participant's ability to
apply the correct rules to solve a given problem. An
analogy item provides a pair of either words or figures
that are related to one another in a particular way. In the
analogy test, a participant has to choose the item that
completes a second pair in such a way that the relationship of the items (words or figures) in the second pair is
the same as that of the first.
The test has 57 items: 30 word analogies and 27
visual analogies. Each item has five answer options. The
scoring is based primarily on the number of correct
answers and secondarily on the speed with which the
participant arrived at each answer. Visual analogies can
contain either pictures or figures. The instructions
inform the participant that the relationships for these
two types of visual analogies are different. Picture analogies are based on the relationships formed by the meaning of the object pair (e.g., relationships of behavior,
function, or features). Figure analogies are based on the
relationships formed by the structure of the object pair
(e.g., similar parts or rotation).

Dials Test
The Dials test is designed to test the participant's
ability to quickly identify and accurately read certain
dials on an instrument panel. The test consists of 20
items completed over a total time of 9 minutes. Individual items are self-paced against the display of time
left in the test as a whole. Participants are advised to skip
difficult items and come back to them at the end of the
test. The score is based on the number of items answered
correctly. The test screen consists of seven dials in two
rows, a layout which remains constant throughout the
test. Each of the seven dials contains unique flight
information. The top row contains the following dials:
Voltmeter, RPM, Fuel-air Ratio, and Altitude. The
bottom row contains the Amperes, Temperature, and
Airspeed dials.
Each test item asks a question about one dial. To
complete each item, the participant is instructed to (1)
find the specified scale on the instrument panel; (2)
determine the point on the scale represented by the
needle; (3) find the corresponding value among the five
answer options; (4) use the numeric keypad to press the
number corresponding to the option.

Angles Test
The Angles test measures the participant's ability to
recognize angles. This test contains 30 multiple-choice
questions and allows participants up to 8 minutes to
complete them. The score is based on the number of
correct answers (with no penalty for wrong or unanswered questions). There are two types of questions.
The first presents a picture of an angle, and the participant chooses the correct measure of the angle (in degrees)
from among four response options. The second presents
a measure in degrees, and the participant chooses the
angle (among four response options) that represents that
measure.

Experiences Questionnaire
The Experiences Questionnaire assesses whether participants possess certain work-related attributes by asking questions about past experiences. There are 201
items to be completed in a 40-minute time frame. Items
cover attitudes toward work relationships, rules, decision-making, initiative, ability to focus, flexibility, self-awareness, work cycles, work habits, reaction to pressure,
attention to detail, and other related topics. Each question is written as a statement about the participant's past
experience and the participant is asked to indicate their level
of agreement with each statement on the following 5-point
scale: 1= Definitely true, 2= Somewhat true, 3= Neither
true nor false, 4= Somewhat false, 5= Definitely false.

Applied Mathematics Test


This test contains 30 multiple-choice questions and
allows participants up to 21 minutes to complete them.
The score is based on the number of correct answers
(with no penalty for wrong or unanswered questions).
The test presents five practice questions before the test
begins. Questions such as the following are contained
on the test:

Letter Factory Test


This test simulates a factory assembly line that manufactures letters "A" to "D" of the alphabet. Examinees
perform multiple and often concurrent tasks during the
test with the aid of a mouse. Tasks include: (1) picking up
letters of various colors from a conveyor belt and loading
them into boxes of the same color; (2) moving empty
boxes from storage to the loading area; (3) ordering new
boxes when supplies become low; (4) calling Quality
Control when defective letters appear; and (5) answering multiple-choice questions about the factory floor
display. The test comprises 18 test parts; each part
begins when the letters appear at the top of the belts and
ends with four multiple-choice questions. Awareness
questions assess the state of the screen display. Easier
questions are presented during lulls in assembly line
activity and assess the current state of the display. More
difficult questions are asked during peak activity and
assess what a future display might look like.
Overall scores on the LFT are based on (1) the
number of boxes correctly moved to the loading area;
(2) the time it takes to move a box after it is needed; (3)
the number of letters correctly placed into boxes; and (4)
answers to the awareness questions. The following
actions lower test scores: (1) allowing letters to fall off
the end of a belt; (2) placing letters in an incorrect box;
(3) not moving a box into the loading area when needed;
and (4) attempting to move the wrong box into the
loading area.

same tasks as in Part 2, but the statements below the


planes are a little more difficult to analyze. In all other
respects, the participants perform in the same manner.
Scan Test
In the Scan test, participants monitor a field that
contains discrete objects (called data blocks) which are
moving in different directions. Data blocks appear in
the field at random, travel in a straight line for a short
time, then disappear. During the test, the participant
sees a blue field that fills the screen, except for a 1-inch
white bar at the bottom. In this field, up to 12 green data
blocks may be present. The data blocks each contain two
lines of letters and numbers separated by a horizontal
line. The upper line is the identifier and begins with a
letter followed by a 2-digit number. The lower line
contains a 3-digit number. Participants are scored on
the speed with which they notice and respond to the data
blocks that have a number on the lower line outside a
specified range. Throughout the test, this range is displayed at the bottom of the screen (e.g., 360-710). To
respond to a data block, the participant types the 2-digit number from the upper line of the block (ignoring
the letter that precedes it), then presses "enter."
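A minimal sketch of this response rule follows; the data-block values and the 360-710 range are hypothetical examples, and this is not the original test software.

    # Each data block: (identifier line, 3-digit lower-line value)
    data_blocks = [("B42", 512), ("C07", 355), ("A19", 730)]
    low, high = 360, 710          # range displayed at the bottom of the screen

    for ident, value in data_blocks:
        if not (low <= value <= high):        # lower line outside the displayed range
            response = ident[1:]              # type the 2-digit number, ignoring the letter
            print(f"respond to {ident}: type {response}, then press enter")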
Sound Memory Test
The Sound Memory test measures a participant's
listening comprehension, memory, and hand-eye coordination. Participants must hear, remember, and record
strings of numbers varying in length from 5 to 10 digits.
After the digits have been read, there is a brief pause.
Then a yellow box will appear on screen, and participants must type in the numbers they heard and remembered, in the order presented orally. Participants may
use the backspace to delete and correct the numbers they
enter, and press the enter key to submit the answer.
Each participant's score equals the total number of
digits the participant remembers correctly. If the participant transposes two digits then half-credit is given.
Items must be answered in the order presented; participants cannot skip and return to previous items. If too
few digits are typed then the missing digits are scored as
incorrect; if too many digits are typed then the extra
digits are ignored. The object is simply to recall digits
heard in the correct order.
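A hedged sketch of this scoring rule follows, assuming that transpositions are credited only for adjacent digit pairs (a detail the text does not spell out); the digit strings are hypothetical.

    def sound_memory_score(presented, typed):
        # One point per digit recalled in its correct position; a transposed
        # adjacent pair earns half credit per digit; extra typed digits are
        # ignored and missing digits count as incorrect.
        score, i = 0.0, 0
        while i < len(presented):
            if i < len(typed) and typed[i] == presented[i]:
                score += 1.0
                i += 1
            elif (i + 1 < len(presented) and i + 1 < len(typed)
                  and typed[i] == presented[i + 1] and typed[i + 1] == presented[i]):
                score += 1.0          # two transposed digits at half credit each
                i += 2
            else:
                i += 1
        return score

    print(sound_memory_score("73582", "73852"))   # 4.0: the 5 and 8 were transposed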

Planes Test
The Planes test contains three parts, each with 48
items to be completed in 6 minutes. Each individual
item must be answered within 12 seconds. Part 1:
Participants perform a single task. Two planes move
across a screen; one plane is red, the other is white. Each
plane moves toward a destination (a vertical line) at a
different speed. The planes disappear before they reach
their destinations, and the participant must determine
which plane would have reached its destination first. To
answer each item, the participant presses the "red" key
if the red plane would have reached the destination first,
and the "white" key if the white plane would have
arrived first. Participants can answer while the planes are
still moving, or shortly after they disappear. Part 2: Part
2 is similar to Part 1, but participants must now perform
two tasks at the same time. In this part of the test,
participants determine which of two planes will arrive at
the destination first. Below the planes, a sentence will
appear stating which plane will arrive first. The participant must compare the sentence to their perception of
the planes' arrival, and press the "true" key to indicate
agreement with the statement, or the "false" key to
indicate disagreement. Part 3: Participants perform the

Time Wall/Pattern Recognition Test


The Time Wall/Pattern Recognition test consists of
two tasks that measure the examinee's ability to judge
the speed of objects and to compare visual patterns at the
same time. In the time judgment task, the participant
watches a square move from left to right and estimates
when it will hit a wall positioned on the right side of the
display screen. In the pattern comparison task, the
participant determines whether two patterns are the
same or different from each other. Each exercise begins
with a square moving toward a wall at a fast, medium,
or slow speed. After a short while, the square disappears
behind the pattern recognition screen. The participant
must hit the "stop" key at the exact moment the square hits
the wall.
In the pattern comparison task, the participant is
shown two blue circles, each with an overlay pattern of
white dots. Test takers are requested to press the "same"
key if the patterns are the same or press the "differ" key
if the patterns are different. Concurrently, participants
should press the "stop" key when they think the square
will hit the wall, even if they are in the middle of
comparing two patterns. Participants are scored upon
how quickly they respond without making mistakes.
The score is lowered for each incorrect judgment.

then recall these at two different testing times: one


immediately following a practice session and another in
a subsequent testing block. The practice session lasts 4
minutes, during which the list of 24 SPRON words and
their English equivalents are displayed in a box to the
right of the display screen while the multiple-choice
items are displayed on the left. The practice items allow
the test takers to apply their memory by allowing them
to review the SPRON-English list of words as a reference. The first testing session starts immediately following the practice session and lasts 5 minutes. The second
testing session starts in a subsequent testing block (after
a break time) and also lasts for 5 minutes. Each multiple-choice item displays the SPRON word as the item stem
and displays five different English equivalents as the five
response alternatives.

CONCLUSION
The initial AT-SAT test battery (Alpha) was professionally developed after a careful consideration of multiple factors. These included an examination of the
SACHA job analysis and prior job analyses that produced lists of worker requirements, prior validation
research on the ATCS job, and the professional judgment of a knowledgeable and experienced team of
testing experts.

Word Memory Test


The Word Memory test presents a series of 24 words
in an artificial language (i.e., "SPRON") and their
associated English equivalents. The goal is to memorize
the 24 SPRON words and their English equivalents and


CHAPTER 3.2
AIR TRAFFIC - SELECTION AND TRAINING ALPHA PILOT TRIAL AFTER-ACTION REPORT
Claudette Archambault, Robyn Harris
Caliber Associates
INTRODUCTION

AT-SAT Pilot Test Description


The AT-SAT Pilot Test is a series of five test blocks
(Blocks A through E) and Ending Blocks. (There are
four different Ending Blocks.) The tests are designed to
measure different aptitudes required for successfully
performing the job of air traffic controller. Tests are
subdivided as follows:

The purpose of this report is to document the observations of the Air Traffic - Selection and Training
Completion (AT-SAT) predictor battery (alpha version) pilot trial. The AT-SAT predictor battery is a series
of tests in five blocks (A through E) of 90 minutes each
and four different ending blocks of 20 minutes each.
The pilot test was administered February 19 through
March 2, 1997, in the Air Traffic Control School at the
Pensacola Naval Air Station in Pensacola, Florida. Participants consisted of 566 students stationed at the
Naval Air Technical Training Center (NATTC). Of the
566 students, 215 of the participants were currently
enrolled in the Air Traffic Control School and 346 were
students waiting for their classes at NATTC to begin.
(The status of five participants was unknown.)

Block A contains one test entitled Air Traffic Scenarios (ATS).


Block B contains the Sound Test and the Letter
Factory Test (LFT).
Block C contains the Dials Test, Static Vector/Continuous Memory Test (SVCM), and Experiences Questionnaire.
Block D contains the Time Wall/Pattern Recognition Test (TWPR), the Analogy Test, and the Classification Test.
Block E contains the Word Memory Test, the Scan
Test, the Planes Test, the Angles Test, and the Applied
Mathematics Test.

This report contains the following sections:

Pilot Test Description and Procedures


General Observations
Feedback on Test Block A
Feedback on Test Block B
Feedback on Test Block C
Feedback on Test Block D
Feedback on Test Block E
Feedback on the Ending Block

Depending on the participant's group number, the


Ending Block consisted of one of the following.

The report concludes with a summary of all of the


feedback and observations.

the LFT
the ATS
the SVCM and Word Memory tests
the Word Memory and TWPR tests

The following section describes the test administration procedures including the sequence of the testing
blocks for groups of participants.

THE AT-SAT PILOT TEST DESCRIPTION AND ADMINISTRATION PROCEDURES

Pilot Test Administration Procedures

The following sections describe the AT-SAT pilot


test and pilot test administration procedures.

Participants were arranged in five groups of ten


(Groups 1 through 5). Test Administrators (TAs) supplied the testing rooms with 55 computers. Fifty of the
computers were used for testing stations; five were
failure-safe or recovery stations. Recovery stations were
reserved for use by participants when TAs were not able
to restore operation to a malfunctioning computer.
In one classroom, there were 33 computers (30 for
testing and three failure-safe computers): Groups 1, 2,
and 3 were tested on computers 1 through 30 (See
Exhibit 3.2.1). In a second classroom, there were 22
computers (20 for testing and two failure-safe computers). Groups 4 and 5 were tested in the second room on
computer numbers 31 through 50. Exhibit 3.2.1 displays the sequencing of test blocks. (The exhibit does
not reflect breaks.)
Participants were offered a minimum of a ten-minute
break between each of the five testing sections. Because
the tests are self-paced, participants were not required to
take the 10-minute breaks between blocks. They were
required to take a 1.5 hour meal break between sessions
two and three.

the names of the tests and a short description of each


test, the aptitudes that are being tested, and the time
allotted for each test should be added. This screen may
eliminate discrepancies where participants are unclear
as to whether to continue with the other tests in the
block when they reach the end of a test.
Test Ending
The end of each test should include a brief statement
in a text box stating that the participant has completed
the current test and should press "enter," or click on the
"continue" button (with the mouse pointer) to proceed to
the next test in the block. The text box could also state
the number of tests completed and the number of tests
that remain for each block.
Currently, some blocks do not indicate the end of the
block with a text box. Some tests simply go to a blue
screen and do not indicate that the test has indeed
ended. The final test in a block should indicate not only
that the current test is finished but also that the participant has indeed completed all tests within the block and
that they should raise their hand to speak with the Test
Administrator.
Not all of the tests indicate the results of the tests
and/or practice sessions. For consistency, either all
tests should display results, or all tests should not
display results.

GENERAL OBSERVATIONS
This section presents some general observations about
the entire AT-SAT Battery Pilot Test. The remarks in
this section address the instructions, the test ending, the
purpose of the tests, and the introductory block.
Instructions
Instructions for several of the tests in the battery need
improved clarity. Participants often did not understand
the test instructions as written but proceeded with the
tests, anticipating that the objective of the tests would
become more clear as the tests proceeded. Too often,
however, participants still did not understand the objective even after attempting a few examples. (After participants completed the examples, they would often raise
their hand and ask for further instructions.) Therefore,
any practice sessions for the tests did not clarify the
confusing instructions. The test instructions that need
revision and feedback for specific test blocks are discussed in the following sections.

Introductory Block
The addition of an Introductory Block (IB) is recommended. The IB could include an explanation of the general purpose of
the testing, a modified version of the Keyboard Familiarization section, and the current Background Information questions.
The explanation of the general purpose of the test
might also include a brief description of the evolution of
the test (how the FAA came to design this specific testing
procedure). This section could describe the types of tests
and the general purpose of the tests (i.e., ability to multitask, ability to follow instructions, skill with plane
routing procedures, etc.). Finally, general grading/scoring procedures could be explained with more specific
explanations within each of the tests.
The Keyboard Familiarization (KF) currently includes instruction and practice for the number keys and
the A, B, C keys (after the Test Administrator exchanges the slash, star, and minus keys with the A, B,
and C keys) on the numeric pad on the right side of the
keyboard. Instructions should be modified to include
the names of the tests requiring the use of these keys.

Purpose of the Tests


Participants also required further clarification of the
purpose of tests within the blocks during the practice
session (instead of before or after the test). Perhaps a
short paragraph including the aptitudes that are being
tested would clarify the purpose of certain tests.
In addition to more specific test instructions, an introductory screen at the start of each block, to include the number of different tests within the specific block, the names of the tests and a short description of each test, the aptitudes that are being tested, and the time allotted for each test, should be added. This screen may eliminate discrepancies where participants are unclear as to whether to continue with the other tests in the block when they reach the end of a test.


Directions under KF should also include the names


of the tests that will require the use of the numerical keys
on the top of the keyboard. A practice session should
also be included for these keys to allow participants to
become acquainted with the placement of their hands at
the top of the keyboard.
Background information questions should be included within the Introductory Block. This will allow
participants to practice using the keyboard outside of a
testing block. It will also allow them to ask the Test
Administrator questions about the test, use of the keyboard, the placement of hands on the keyboard, and so on.

New Planes
New planes that appear in the participant's airspace
are white (while all other planes are green). The white
planes remain circling at the point where they entered
the airspace until they receive acknowledgment from
the controller (by clicking on the graphic with the
mouse pointer). Often during testing participants did
not understand the purpose of the white planes in their
airspace. They would leave the white planes circling and
never manipulate their heading, speed, or level. White
planes need to be more clearly defined as new planes in
the controller's airspace that require acknowledgment
by the controller.

FEEDBACK ON TEST BLOCK A


Countdown
At the start of a scenario, participants often did not
notice the countdown (on the counter at the bottom
right-hand corner of the screen) before the beginning of
a test. There is a delay (of approximately 7 seconds)
between the time the test initially appears on the screen
and the time the participant can perform an action to the
planes on the screen.
During this delay, some participants continuously
pushed the enter button, which would often result in
one of two consequences: (1) The computer screen
would permanently freeze (such that the system would
need to be rebooted); (2) at the end of the test, the
participant received the plain blue screen (indicating
that the test was complete). However, once the Test
Administrator closed-out the blue screen and returned
to the program manager, there would remain a row of
several icons with each icon indicating an air traffic
scenario. The Test Administrator would need to presume that the next correct test in the sequence was the
first in the row of icons and double click on that icon to
begin a scenario. At the end of each scenario, the Test
Administrator would double-click on the next scenario
in the row of icons until all scenarios were complete.
For participants to clearly see that there is a delay
before they can manipulate the planes on the screen,
perhaps the countdown timer can be moved to a more
conspicuous position in the middle of the screen (as in
the Static Vector/Continuous Memory Test). An alternative would be to display the counter in a brightly
colored text box (still in the bottom right-hand corner
of the screen). After the countdown timer had finished,
the text box could change colors and blend with the
other instructions on the screen.

Block A was the only block that consisted of only one


test. Therefore, the comment below applies to the Air
Traffic Scenarios Test (ATST).
The ATST requires participants to manipulate the
heading, speed, and level (or altitude) of planes in their
airspace. On the testing screen, participants see the
airspace for which they are responsible, two airports for
landing planes, and four exits for routing planes out of
the airspace. The screen also displays the controls for
directing planes: (1) the heading (to manipulate individual plane direction), (2) the speed (slow, medium, or
fast), and (3) the level (1, 2, 3, 4). Finally, a landing heading
indicator is displayed that informs the participant of the
direction to land planes at each of the airports.
Instructions
Instructions for the ATST may need more clarification.
Often, participants required further clarification on:
the meaning of each of the plane descriptors that
appear on the screen
the difference between white and green planes
the need to click on the graphic (depicted as an arrow)
that represents the plane (versus the text descriptors of
the plane) to change the heading, level, and speed.
Instructions need to be rewritten to include more
details on the descriptors accompanying the planes.
Perhaps in the instructions section, the descriptors can
be enlarged on the screen with an arrow pointing to the
definition of the letters and number as in Exhibit 3.2.2.


Landing Heading Indicator


Participants often did not notice the landing heading
indicator located on the bottom right-hand corner of
the screen. Others noticed the arrow but did not understand its purpose. Further instruction on the location
and the purpose of the landing heading indicator may be
necessary. Perhaps during the practice scenario, a text
box can flash on the screen pointing out when the
participant has landed a plane in the incorrect direction.
The same idea may be useful to point out other participant errors during the practice session(s).


FEEDBACK ON TEST BLOCK B


This section details the observations for the two tests
in Block B: the Sound Test and the Letter Factory Test.
Specific comments about each test are provided below.

Letter Factory Test


This test measures four abilities required to perform
air traffic controller jobs. These abilities are: (1) planning and deciding what action to take in a given
situation through the application of specific rules; (2)
thinking ahead to avoid problems before they occur; (3)
continuing to work after being interrupted; and (4)
maintaining awareness of the work setting.

Sound Test
For this test, the participant uses headphones to listen
to a sequence of numbers. Then the participant must
repeat the sequence of the numbers heard using the
right-hand numeric keypad to record the sequence of
numbers.
Failures
It was found in the first day of testing that computers
would lock or fail if the Sound Test was run after any
other blocks. In other words, unless Block B was first in
the sequence of testing, computers would fail (at the
moment participants are prompted for sound level
adjustment) and need to be rebooted. This proved
disruptive to other participants and delayed the start of
the test (since Test Administrators can only aid one or
two participants at a time). To prevent failures during
the testing, Test Administrators would reboot every
computer before the start of Block B. Still, the software
would sometimes fail at the moment the participant is
requested to adjust the volume to their headphones via
the keyboard (versus the sound level adjustment located
directly on the headphones). On occasion, the Sound
test would still fail, but after again rebooting the computer, the program recovered.
After several attempts to restore the program where
there were repeated failures, the computer still did not
allow the participant to continue with the test. In these
cases where a computer failed repeatedly, participants

Test Instruction
The test instructions are clear and well-written. Few
participants had questions in reference to the tasks they
were to perform once the test began.
Demonstration
Participants were often confused during the demonstration because the pointer would move when they
moved the mouse, but they could not click and
manipulate the screen. Participants would ask Test
Administrators if they had already begun the test since
they could move the pointer. Perhaps the mouse can be
completely disabled during the demonstration to eliminate confusion. Disabling the mouse would allow participants to concentrate on the instructions since they
would not be distracted by movement of the mouse.
Mouse Practice Instructions
Instructions for the mouse practice session are not
clear. The objective of the mouse practice is for the
participant to click on the red box in the middle of the


screen and then click on the conveyer belt that illuminates. Participants are often unsure of the objective.
Perhaps text box instructions can be displayed on the
screen that direct the participant to click on the red box.
As the participant clicks on the red box, another instruction screen would appear, telling the participant to click
on the illuminated conveyer belt. After a few sequences
with text box instruction, the instructions could be
dropped.
Some participants had difficulty completing the
mouse practice session. They continuously received
messages instructing them to "...move the mouse faster and in a straight line." Perhaps there should be a limit to
the number of mouse practice exercises. It is possible
that some participants are not capable of moving the
mouse quickly enough to get through this section.

Keyboard Issues
Many participants attempted to use the numerical
keys on the right-hand side of the keyboard to answer
the items rather than the using the keys on the top of the
keyboard as instructed. When participants use the right-hand keypad, their answers are not recorded. The keys
to be used for this test need to be stated more explicitly.
Participants may be using the right-hand keypad
because of the instruction they receive in the Keyboard
Familiarization (KF) section at the beginning of the
testing. The current version of the KF only provides
instruction for use of the keypads on the right-hand side
of the keyboard. The KF does not instruct participants
on the use of the numerical keys on the top of the
keyboard.
As noted previously, the KF needs to be modified to
include instructions on the use of the keys on the top of
the keyboard. For data to be properly collected, it is
critical for participants to use the top keys.

FEEDBACK ON TEST BLOCK C


This section details the observations and suggestions
for the three tests in Block C: the Dial Test, Static
Vector/Continuous Memory Test, and Experiences
Questionnaire. Specific comments about each test are
provided below.
Dial Test
This measures the participant's ability to quickly and
accurately read dials on an instrument panel. Participants did not appear to have difficulties with this test.
Test Administrators rarely received questions from participants about this test.

Experiences Questionnaire
The Experiences Questionnaire determines whether
the participant possesses work-related attributes needed
to be an air traffic controller. Participants generally did
not ask any questions about the Experiences Questionnaire. The occasional inquiry was in reference to the
purpose of certain questions. Test Administrators did
not receive questions about the wording of the items.

Static Vector/Continuous Memory Test


This measures the participant's ability to perform
perceptual and memory tasks at the same time. The
perceptual task involves determining whether two planes
are in conflict. The memory task involves remembering
flight numbers. On each trial, the screen displays a plane
conflict problem on the left side of the screen and a
memory problem on the right side of the screen. An
attention director indicates which problem the participant is to work on and is located in the middle at the
bottom of the screen.

FEEDBACK ON BLOCK D
This section details the observations and suggestions
for the three tests in Block D: the Time Wall/Pattern
Recognition Test; the Analogy Test; and the Classification Test. Specific comments about each test are provided below.

Instructions
Participants do not understand the instructions for
the Memory part of the test. Numerous participants
asked for clarity on what numbers they were to compare.
They often think they should be comparing the two numbers that are on the screen at that moment, rather than comparing the top number of the current screen to the bottom number of the previous screen.
The example for determining the conflict for the Static Vector questions is not clear. The rule about the planes requiring 2000 feet difference was confusing because they did not understand that, although the altitude is actually displayed in hundreds of feet, the altitude represents thousands of feet.


Time Wall/Pattern Recognition Test


This test measures the participant's ability to judge
time and motion and make perceptual judgments at the
same time. The time judgment task involves watching
a ball move (from the far left-hand side of the screen)
and estimating when it will hit a wall (located on the far
right of the screen). The pace of the ball is different for
every scenario. The perceptual task involves determining whether two patterns are the same or different.
These tasks must be performed concurrently by the
participant. The following paragraphs provide observations and suggestions for this test. This section includes
observations and suggestions for improving the Time
Wall/Pattern Recognition test in Block D.

Analogy Test
The Analogy Test measures the participant's reasoning ability in applying the correct rules to solve a given
problem. The participant is asked to determine the
relationship of the words or pictures in set A and use this
relationship to complete an analogy in set B. The
following paragraph provides observations and suggestions for this test.
Level of Difficulty
The vocabulary level and the types of relationships
depicted in the Analogy Test may have been too difficult
for the pilot test participants. Perhaps the questions can
be revised to require a lower level of vocabulary and
reasoning skills for the participants.

Location of the Broken Wall


When the participant does not stop the ball from
hitting the wall in a timely manner, the screen displays
a broken wall. However, the broken wall appears in the
middle of the screen, rather than on the right-hand side
of the screen. In reference to this, participants often
asked how they were to determine when the ball would
hit the wall if the wall was always moving. Test Administrators had to explain that the wall did not move, but
that once the ball broke through the wall, the screen
displayed the distance past the wall the ball had moved.
To eliminate confusion, perhaps the broken wall can
remain on the right-hand side of the screen and just
appear broken rather than being moved to the center of
the screen.

Classification Test
This also measures the participant's reasoning ability
in applying the correct rules to solve a given problem.
The Classification Test is similar to the Analogy Test,
except that the participant is required to determine the
relationship of three words or pictures and use this
relationship to complete the series with a fourth word or
picture. The following paragraph provides observations
and suggestions for the improvement of this test.
Level of Difficulty
Similar to the issues discussed with the Analogy Test,
many of the items in the Classification Test appeared to
be difficult for the pilot test population. The Classification Test could be revised to allow a lower level of
vocabulary and reasoning skills.

Keyboard
As with the Static Vector/Continuous Memory Test,
many participants attempted to use the numerical keys
on the right-hand side of the keyboard to answer the
items rather than using the keys on the top of the
keyboard as instructed. When participants use the right-hand keypad, their answers are not recorded. The keys
to be used for this test need to be stated more explicitly.
Participants may be using the keypad because of the
instruction they receive in the Keyboard Familiarization
(KF) section at the beginning of the first block of testing.
The current version of the KF only provides instruction
for using the keypad. The KF does not instruct participants on the use of the numerical keys on the top of the
keyboard.

FEEDBACK ON TEST BLOCK E


This section details the observations and suggestions
for the five tests in Block E. Specific comments about
each test are provided below.
Word Memory Test
The Word Memory Test requires the participant to
remember the English equivalents for words in an artificial
language called Spron. The following paragraphs provide observations and suggestions for this test.


Level of Difficulty
The majority of participants appeared to understand
how to respond to this test. The practice session for this
test seemed to work well in preparing participants for
the actual test questions.


Erroneous Text Boxes


Several text boxes appear during this test that should
be removed for future versions of the Word Memory
Test. The test provides the participant with a text box at
the end of the test that displays a total score. This is
inconsistent with many of the other tests in the AT-SAT
Battery that provide no final scores to the participants.
Also, when the test begins, a text box appears, which prompts the participant to "Press Enter to begin." Once the participant presses enter, another text box appears that prompts the participant to "Please be sure Num Lock is engaged." Because these text boxes are irrelevant, the software should eliminate these messages in future versions.

Computer Keyboards
Since the directions instructed the participants to
respond as quickly as possible, in their haste, many
participants were pressing the numeric keys very hard.
The banging on the keyboard was much louder with this
test than with any of the other tests; this may affect the
longevity of the numeric keys when this test is repeated
numerous times.
Planes Test
The Planes Test measures the participant's ability to
perform different tasks at the same time. The Planes
Test consists of three parts. In Part one, the participant
uses the 1 and the 3 keys to determine whether the
red plane (1) or the white plane (3), which are at varying
distances from their destinations, will reach its destination first. In Part two, the participant uses the 1 and
the 3 keys to determine if a statement about the red
and white planes as they are in motion is true (3) or false
(1). In Part three, the participant uses the 1 and the 3
keys to determine if a statement about the arrival of the
red and white planes at their destination is true (3) or
false (1), but unlike in Part two, the planes are at varying
distances from their destinations. The following paragraphs provide observations and suggestions for the
improvement of this test.

Scan Test
The Scan Test measures a participant's ability to
promptly notice relevant information that is continuously moving on the computer screen. Participants are
provided with a number range and asked to type the
identifier for numbers that appear on the screen outside
of that range. A revised version of this test was installed
midway through the pilot test, which changed the
process for recording data but did not change the
appearance or the performance of the test for the participants. The following paragraphs provide observations
and suggestions for the improvement of this test.
Instructions
While the instructions for the test seemed clear,
participants had some common misunderstandings with
the test instructions. First, participants typed the actual
numbers which were outside of the number range
instead of the identifier numbers. This confusion might
be alleviated by revising the text that appears on the
bottom of the screen during the test. It currently states,
"Type the identifier numbers contained in the data blocks with the lower line numbers falling beyond the range." It could be revised to state, "Type the identifier numbers contained in the data blocks (following the letter) with the lower line numbers falling beyond the range." Second, participants did not know to push "Enter" after typing the identification numbers. This
confusion might be alleviated by highlighting the text
that appears at the bottom of the screen during the test to "Press Enter" to record this selection. Third, participants did not know whether the instructions to identify numbers outside the range were inclusive of the numbers at the top and bottom of the range. This issue should be explicitly stated in the test instructions.

Practice Sessions
The practice sessions preceding the first two parts of
the Planes Tests are somewhat lengthy. There are 24
practice items that the participant must complete before
the actual test of 96 items. If the number of practice
items were reduced by one half, the participants would
still have enough practice without becoming bored
before the actual test begins.
Level of Difficulty
Participants appeared to be challenged by the Planes
Test. One factor that added to the level of difficulty for
the participants was that the response keys for Parts two
and three of this test are: 1 = false and 3 = true. It was
more intuitive for many participants that 1 = true and
3 = false; thus, they had a difficult time remembering


which keys to use for true and false. This might have
caused participants more difficulty than actually determining the correct answer to the statements. If the
labeling of the true and false response keys cannot be
modified in future software versions, a message box can
be created to remain on the screen at all times that
indicates 1 = false and 3 = true.


FEEDBACK ON THE ENDING BLOCK


This section details the observations and suggestions
for the four retests included in the Ending Block.
Specific comments about each ending test block are
provided below.

Test Results
Once the participant provides a response to an item
on the Planes Test, a results screen appears indicating
whether the response was right or wrong. This is
inconsistent with many of the other tests in the AT-SAT
Battery that do not indicate how a participant performs
on individual test items, in addition to further lengthening an already lengthy test.

Letter Factory Re-Test


Participants in Group One (computers 1-10) and
Group Five (computers 41-50) completed a re-test of
the Letter Factory as their Ending Block. This version of
the Letter Factory Test does not provide the participant
with any test instructions or opportunities to practice
before beginning the test. However, participants appeared to have little difficulty remembering the instructions for this test from Block B.

Angles Test
This measures a participant's ability to recognize
angles and perform calculations on those angles. The
following paragraph provides observations and suggestions for this test.

Air Traffic Scenarios Re-Test


Participants in Group Two during the pilot test
(computers 11-20) completed a re-test of the Air Traffic
Scenarios as their Ending Block. This re-test allows the
participant to review the instructions before beginning
the abbreviated-length version of the Air Traffic Scenarios. The proposed revisions to the Air Traffic Scenarios Test in Section 4 of this report also apply to this
version of the test in the Ending Block.

Level of Difficulty
Participants appeared to be challenged by this test,
although it seemed as if they could either very quickly
determine a response about the measure of an angle, or
it took them some time to determine their response.
Applied Mathematics Test
This measures the participant's ability to apply mathematics to solve problems involving the traveling speed,
time, and distance of aircraft. The following paragraphs
provide observations and suggestions for the improvement of this test.

Static Vector/Continuous Memory and Word Memory Re-Test
Participants in Group Three (computers 21-30) completed a re-test of the Static Vector/Continuous Memory
Test and the Word Memory Test as their Ending Block.
The re-test of the Static Vector/Continuous Memory
Test allows the participant to review the instructions
but does not provide a practice session before the actual
test begins. The proposed revisions to the Static Vector/
Continuous Memory Test in Section 6.2 of this report
and to the Word Memory Test in Section 8.1 of this
report also apply to these versions of the tests in the
Ending Block.

Instructions
A sentence should be included in the instructions
that no pencils, paper, or calculators may be used during
this test. Many pilot test participants assumed that these
instruments were allowed for this portion of the test.
Level of Difficulty
Many participants appeared to have difficulty determining the best answer to these mathematical questions. Several participants spent so much time trying to
calculate an answer that they ran out of time and were not able to complete this test. Perhaps the level of difficulty of the applied mathematics questions can be reduced.


Word Memory and Time Wall/Pattern Recognition Re-Test
Participants in Group Four (computers 31-40) completed a re-test of Word Memory and the Time Wall/
Pattern Recognition as their Ending Block. The re-test
of the Time Wall/Pattern Recognition Re-Test allows
the participant to review the instructions and complete
a practice session before beginning the test. The proposed revisions to the Word Memory Test in Section
8.1 of this report and the Time Wall/Pattern Recognition Test in Section 7.1 of this report also apply to these
versions of the tests in the Ending Block.


SUMMARY OF THE FEEDBACK ON THE AT-SAT PILOT TEST BATTERY
This section of the report summarizes the feedback
on all the test blocks within the AT-SAT Pilot Test
Battery. Overall, we are recommending relatively few
changes to the entire battery of tests. The majority of the recommended changes are intended to enhance the clarity of test instructions, increase the value of the test practice sessions, and revise some of the questions for the ability level of the participants. Exhibit 3.2.3, on the following page, displays a summary of the proposed revisions to the pilot test software.
The information provided by the Test Administrators was one of the information sources used to revise the test battery. A significant effort on the part of the project team went into revising the instructions for the tests and the other changes recommended by the Test Administrators. The next section discusses the psychometric information used to revise the battery. Both sources of information provided the test developers the information necessary to build the Beta Battery, which was used in the concurrent validation study.


CHAPTER 3.3
ANALYSIS AND REVISIONS OF THE AT-SAT PILOT TEST
Douglas Quartetti and Gordon Waugh, HumRRO
Jamen G. Graves, Norman M. Abrahams, and William Kieckhaefer, RGI, Inc
Janis Houston, PDRI, Inc
Lauress Wise, HumRRO
This chapter outlines the rationale used in revising
the tests and is based on the pilot test data gathered prior
to the validation study. A full description of the samples
used in the pilot study can be found in Chapter 3.2. It
is important to note that some of the tests were developed specifically for use in the AT-SAT validation
study, and therefore it was imperative that they be pilot-tested for length, difficulty, and clarity. There were two
levels of analysis performed on the pilot test data. First,
logic and rationale were developed for the elimination
of data from further consideration in the analyses. After
the elimination process, an item analysis of each
test was used to determine the revisions to tests and
items that were needed.
Exclusionary decision rules were based on available
information, which varied from test to test. For example, in some instances, item latency (time) information was available as the appropriate method for
exclusion; in other cases, the timing of the tests was
computer driven and other criteria for exclusion were
developed. An item was considered a candidate for deletion if it exhibited any of the following characteristics:


Low Discrimination: The item did not discriminate between those individuals who received high versus low total scores, as indicated by the biserial correlation.
Check Option: One or more incorrect response options had positive biserial correlations with total test
score.
Too Hard: The percent correct was low.
Too Easy: The percent correct was high.
High Omits: The item was skipped or not reached,
with these two problems being distinguishable from
each other.


Applied Math Test


Case Elimination
To determine reasonable average and total latencies
for the items attempted, the original sample of 435 was
restricted to those individuals who completed all 53 items (N=392) of the Applied Math test (AM). Examining the average latency in seconds for the items revealed a mean time of 14.7 and a standard deviation of 10. After review of the actual test items, it was decided that any individual spending less than 4 seconds per item was probably responding randomly or inappropriately. Review of a scatter plot of average latency by percentage correct revealed that those individuals taking less than 5 seconds scored at the extreme low end, about half scoring below chance. To corroborate this information, a comparison of scores on the Applied Math test and scores on the ASVAB Arithmetic Reasoning test (AS_AR) identified individuals who had the mathematical ability but were not motivated to perform on the Applied Math test (i.e., high ASVAB score but low AM score).
Based on this information, three guidelines for eliminating individuals were formulated:
(1) High Omits: It was determined that any individual attempting fewer than 35 items AND receiving a percent correct score of less than 39 percent was not making a valid effort on this test.
(2) Random Responders: After reviewing and comparing the percent correct scores for the Applied Math test and the AS_AR scores, it was determined that any individual whose AS_AR was greater than 63, but whose percent correct was less than 23%, was not putting forth an honest effort on this test.
(3) Individuals whose average latency was less than 4 seconds were excluded from further item analysis.
Application of these exclusion rules further reduced the sample size to 358 for the item analysis.
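To make the three screening guidelines concrete, the sketch below applies them to a few illustrative records; the field names, the sample values, and the function name are assumptions introduced here for illustration and are not taken from the project's actual analysis code.

def fails_effort_screen(rec):
    # (1) High Omits: fewer than 35 items attempted AND percent correct below 39
    high_omits = rec["items_attempted"] < 35 and rec["pct_correct"] < 39
    # (2) Random Responders: AS_AR above 63 but Applied Math percent correct below 23
    random_responder = rec["asvab_ar"] > 63 and rec["pct_correct"] < 23
    # (3) Average latency under 4 seconds per item
    too_fast = rec["avg_latency_sec"] < 4
    return high_omits or random_responder or too_fast

# Illustrative records only (not pilot data).
cases = [
    {"items_attempted": 53, "pct_correct": 72, "asvab_ar": 60, "avg_latency_sec": 14.2},
    {"items_attempted": 30, "pct_correct": 25, "asvab_ar": 55, "avg_latency_sec": 6.0},
    {"items_attempted": 53, "pct_correct": 20, "asvab_ar": 70, "avg_latency_sec": 9.0},
    {"items_attempted": 53, "pct_correct": 35, "asvab_ar": 50, "avg_latency_sec": 3.1},
]
retained = [rec for rec in cases if not fails_effort_screen(rec)]
print(len(retained))  # 1; the last three records each trip one of the rules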

Item Analysis
On the Applied Math test, all of the items that were characterized as High Omits were items that the participants did not reach because of test length, not items that they merely skipped. Additionally, this test has four response options with each item, and therefore the chance level of a correct response is 25%.
After review of the item analysis, 18 items were


deleted, reducing the test length to 30 items. The item
analysis printout for the deleted items can be found in
Appendix A. An extensive review of the items by content
and computation type was conducted to ensure fair
representation of relevant item types. The item types
represented were Computing Distances, Computing
Travel Time, Computation Given Multiple Point Distances, Computing Ascending/Descending Rates, and
Computing Speed.

Summary and Recommendations


The 13 items that had low discrimination or response
options that were chosen more often than the correct
response were eliminated, reducing the test length to 44
items. An additional recommendation was that 17-inch display monitors be used in the beta version to
ensure the integrity of the graphics.
Angles Test
Case Elimination
For the Angles test, all 445 individuals completed the
entire test (30 items). A scatter plot was created of the
average latency per item in seconds by the percent
correct for attempted items. The mean and standard
deviation for average latency were 8.2 and 2.67, respectively. The mean and standard deviation for percent
correct were 67.73% and 17.7%, respectively. A grid
based on the means and standard deviations of each axis
revealed that, of the four individuals who were more
than two standard deviations below the mean for average latency (2.86 seconds per item), three scored more than two
standard deviations below the mean for percentage
correct (32.33%). The other individual was about 1.5
standard deviations below the mean for percent correct.
It appears that these individuals were not taking the time
to read the items or put forth their best effort. By
eliminating those individuals with an average item
latency of less than 2.86 seconds, the item analysis sample was
reduced to 441.
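The 2.86-second and 32.33% cutoffs quoted above follow directly from the reported means and standard deviations; the short computation below merely reproduces that arithmetic and is not the report's own analysis code.

# Reported Angles test statistics for the pilot sample (N = 445).
mean_latency, sd_latency = 8.2, 2.67   # average latency per item, in seconds
mean_pct, sd_pct = 67.73, 17.7         # percent correct

latency_cutoff = mean_latency - 2 * sd_latency   # 8.2 - 5.34 = 2.86 seconds
pct_cutoff = mean_pct - 2 * sd_pct               # 67.73 - 35.40 = 32.33 percent
print(round(latency_cutoff, 2), round(pct_cutoff, 2))  # 2.86 32.33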

Summary and Recommendations


The test was shortened from 53 items to 30. Textual
changes were made to four items for clarification. The
items were re-ordered with the five easiest items first,
then the rest of the 30 items randomly distributed
throughout the test. This ensured that the test taker
would reach at least some of the most difficult items.
Dials Test
Case Elimination
For the Dials test, 406 of the 449 participants completed the entire test. A scatter plot of average latency per
item in seconds by percent correct for attempted items
was created for the reduced sample (406). The mean and
standard deviation for average latency were 12.47 and
4.11, respectively. The mean and standard deviation for
percentage correct were 78.89% and 12.96%, respectively. A grid overlay based on the means and standard
deviations revealed that individuals who were more
than two standard deviations below the mean for average latency (4.25 seconds per item) were scoring more than two
standard deviations below the mean for percent correct
(52.97%). It appears that these individuals were not
taking the time to read the items or put forth their best
effort. Following an exclusion rule of eliminating participants who had an average latency per item of 4.25 seconds or
less, the sample was reduced from 449 to 441.

Item Analysis
The item analysis did not reveal any problem items
and there appeared to be a good distribution of item
difficulties. No text changes were indicated. After reviewing the item analysis and the items in the test, none
of the items were deleted.
Summary and Recommendations
This test appears to function as it was intended.
There were no item deletions and no textual changes.

Item Analysis
After review of the item analysis and of specific
items, 13 items were deleted from the original test.
All had low discrimination and/or another response
option that was chosen more frequently than the
correct response. In many instances, the graphics
made it difficult to discriminate between correct and
incorrect dial readings. The revised test consists of 44
items. The item analysis printout for the deleted
items can be found in Appendix A.

Sound Test
Case Elimination
On the Sound test, 437 participants completed 17 or
18 items. Of the remaining five participants, one completed only two items (got none correct) and was deleted
from the sample. The other four participants made it to
the fourth set of numbers (8 digits). All the scores of this
group of four were within one standard deviation (15%)


of the mean of the percentage correct for attempted


items (35.6%). Additionally, five other participants did
not get any items correct. It was determined that two of
them were not trying, and they were deleted. The
remaining three seemed to be "in the ballpark" with
their responses (i.e., many of their responses were almost
correct). With the exclusion of three participants, the
total sample for the item analysis was 439.

Memory Test
Case Elimination
A scatter plot of Memory test items answered by
percent correct revealed a sharp decline in the percent
correct when participants answered fewer than 14 items.
It was decided that participants who answered fewer
than 14 items were not making an honest effort on this
test. Additionally, it was felt that participants who
scored less than 5% correct (about 1 of 24 correct)
probably did not put forth their best effort, and therefore, they were removed from the item analyses. These
two criteria eliminated 14 participants, leaving a sample
of 435 for the item analyses.

Alternative Scoring Procedure


The Sound test consists of numbers of set lengths (5,
6, 7, 8, 9, 10 digits) being read and then participants
recalling them. There are three trials associated with
each number length (i.e., number length 5 has three
trials, number length 6 has three trials, etc.) for a total
of 18 items. Examinees receive a point for every item
answered correctly. An alternative scoring procedure
would be to award a point for each digit they get correct
and one point for a digit reversal error. For example, in
the 5-digit case, a correct response may be 12345, but a
participant may answer 12354 (digit reversal). In this
case, the participant would receive 3 points for the first
three digits and 1 point for the digit reversal, for a total
of 4 points on that trial. This scoring procedure was
examined as an alternative to the number correct score.
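A minimal sketch of that alternative scoring rule is given below. It assumes one point per digit recalled in the correct position plus one point for each adjacent pair of transposed digits; the function name and the treatment of any other error types are assumptions made for illustration, not the scoring code actually used.

def alternative_sound_score(correct, response):
    n = min(len(correct), len(response))
    # One point for each digit recalled in its correct position.
    score = sum(1 for c, r in zip(correct, response) if c == r)
    # One additional point for each adjacent digit reversal (transposition).
    for i in range(n - 1):
        if (response[i] == correct[i + 1] and response[i + 1] == correct[i]
                and correct[i] != correct[i + 1]):
            score += 1
    return score

# The report's example: correct 12345, response 12354 -> 3 digits in place + 1 reversal = 4 points.
print(alternative_sound_score("12345", "12354"))  # 4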

Item Analysis
After review of the item analysis, none of the items
were removed. Item 1 had low discrimination, low
percent correct, and a high number of omits. However,
there were no such problems with the remaining items,
and given that these are nonsense syllables, one explanation may attribute the poor results to nervousness or acclimation on the first item. All items were retained for the
beta version, and no editorial changes were made.
Summary and Recommendations
This test performed as expected and had a normal
distribution of scores. One item had problem characteristics, but a likely explanation may be that it was the first
item on the test. The recommendation was to leave all
24 items as they were but to re-examine the suspect item
after beta testing. If the beta test revealed a similar
pattern, then the item should be examined more closely.

Item Analysis
After review of the item analysis, none of the items
were removed. However, the biserial correlations of the
items from digit length 5 and digit length 10 were
appreciably lower than the rest of the items. The reliability of this test with the original scoring procedure was
.70, while the alternative scoring procedure improved
reliability to .77. Using the alternative scoring procedure, in a comparison of the original version and a
revised version with digit length 5 and digit length 10
removed, the revised version had a slightly higher reliability (.78).

Analogy Test
Case Elimination
For the Analogy test, cases were eliminated based on
three criteria: missing data, pattern responding, and
apparent lack of participant motivation.
Missing Data. The test software did not permit
participants to skip items in this test, but several (12.8%)
did not complete the test in the allotted time, resulting
in missing data for these cases. Those missing 20% or
more of the data (i.e., cases missing data for 11 items or
more) were omitted. Five cases were eliminated from the
sample.
Pattern Responders: The chance level of responding
for this test was 20%. An examination of those participants near chance performance revealed one case where
the responses appeared to be patterned or inappropriate.

Summary and Recommendations


Since digit lengths of 5 and 10 had lower biserial
correlations than the rest of the items, it was recommended that the number of trials associated with these
items be reduced to two each. The alternative scoring
procedure, based on the number of within-item digits
correct with partial credit for digit reversals, was recommended for the beta version.


Unmotivated Participants: Identifying participants


who appeared to be unmotivated was based on the
average latency per item, which was 5.4 seconds. It was determined, because of the complexity of the items, that participants spending 5.4 seconds or less were not taking the test
seriously or were randomly responding, and therefore
were eliminated from the item analyses. As a cross check,
an examination of the percentage correct for those
participants whose average latency was 5.4 seconds or
less showed that their scores were near chance levels.
Four participants were eliminated.
In summary, 10 participants were eliminated from
further analyses, reducing the sample size from 449 to
439.

Testing Time
Based on the sample of 439 participants, 95% of the
participants completed the test and instructions in 33
minutes (Table 3.3.5). Table 3.3.6 shows time estimates
for two different levels of reliability.
Test Revisions
A content analysis of the test revealed four possible
combinations of semantic/non-semantic and word/visual item types. The types of relationships between the
word items could be (a) semantic (word-semantic), (b)
based on a combination of specific letters (word - nonsemantic), (c) phonetic (word - non-semantic), and (d)
based on the number of syllables (word - non-semantic).
The types of relationships between visual items could be
based on (a) object behavior (visual-semantic), (b) object function (visual-semantic), (c) object feature (visual-semantic, (d) adding/deleting parts of the figures
(visual - non-semantic), (e) moving parts of the figures
(visual - non-semantic), and (f) rotating the figures
(visual - non-semantic).
After categorizing the items based on item type, an
examination of the item difficulty level, item-total
correlations, the zero-order intercorrelations between
all items, and the actual item content revealed only one
perceptible pattern. Six non-semantic word items were
removed due to low item-total correlations, five being
syllable items (i.e., the correct solution to the analogy
was based on number of syllables).
Seven more items were removed from the alpha
Analogy test version due to either very high or low
difficulty level, or to having poor distractor items.
Word Items. The time allocated to the Analogy test
items remained approximately the same (35 minutes
and 10 minutes for reading instructions) from the alpha
version to the beta version. The number of word items
did not increase; however, nine items were replaced with
items that had characteristics similar to those of other well-performing word items. There were equal numbers of
semantic and non-semantic items (15 items each).
Since the analogy items based on the number of
syllables performed poorly, this type of item was not
used when replacing the non-semantic word items.
Instead, the five non-semantic word items were replaced
with combinations of specific letters and phonetic items.
Additionally, three semantic items were replaced with
three new semantic items of more reasonable (expected)
difficulty levels.

Scale and Item Analyses


An examination of the biserial correlations for the 53
items revealed 12 items that had biserial correlations of
.10 or less. This reduced the number of items within
three of the four test scales as follows (the original
number of items appears in parentheses): Non-Semantic Words 9 (15), Semantic Words 12 (15), and Semantic Visuals 7 (10). Tables 3.3.1 to 3.3.4 present these
corrected item-total correlations and the alphas for the
items within each scale. As Table 3.3.4 indicates, all 13
items within the Non-Semantic Visual scale met the
criterion for retention. After omitting items based on
the above criteria, the number of items in this test
dropped from 53 to 41.
Construct Validity
A multitrait-multimethod matrix was constructed
to assess whether the information processing scores and
the number-correct scores measure different constructs
or traits. Test scores based on two traits (i.e., Information Processing and Reasoning) and four methods (i.e.,
Word Semantic, Word Non-Semantic, Visual Semantic, and Visual Non-Semantic) were examined. The
results provided the following median correlations:
The median convergent validity (i.e., same trait, different
method) for information processing scores was .49.
The median convergent validity for number-correct scores
was .34,
The median divergent validity (i.e., different trait, different method) was .18.
These preliminary results suggest keeping separate
the information-processing and number-correct scores
for the Analogy test, pending further investigation.


Visual Items. Since the non-semantic picture items


demonstrated a relatively stable alpha (.67) and high
item-total correlations, no items were removed. In an
effort to stabilize the alpha further, three non-semantic
picture items were added, increasing the non-semantic
visual subtest from 13 to 16 items.
One item was dropped because it contained poor
distractors. Two other semantic visual items that appeared to have poor distractors were modified to improve the clarity of the items (without lowering the
difficulty level). In addition, two newly created items
were added to this scale. Thus, one item was replaced
with two new items, and two others were modified.
Instructional Changes. Based on feedback from site
Test Administrators, portions of the Analogy test instructions were simplified to reduce the required reading level of the text. Also, the response mode was
changed from use of a keyboard to use of a mouse. The
"Viewing an Item" section of the instructions was revised
accordingly.

Pattern Responding. From examination of the pattern of responses of participants who scored at or near
chance levels (20%), eight participants were identified
as responding randomly and were eliminated.
Unmotivated Participants. It was decided that participants spending less than 3 per item were not making
a serious effort. Four participants fell into this category.
An examination of their total scores revealed that they
scored at or near chance levels, and thus they were
eliminated from further analyses.
In summary, 22 participants were eliminated from
further analyses, reducing the sample size for this test
from 449 to 427.
Scale Reliabilities and Item Analyses
Reliability analyses were conducted to identify the items
within each of the four test scales that did not contribute to
the internal consistency of that scale. The corrected item-total correlation was computed for each item within a
scale, as well as the overall alpha for that scale.
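For readers unfamiliar with the two statistics named here, the sketch below shows the standard computations (corrected item-total correlation and Cronbach's alpha) on a small made-up score matrix; it illustrates the general formulas rather than the project's analysis code.

from statistics import mean, pvariance

def cronbach_alpha(items):
    # items: one list of scores per item, all over the same respondents.
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    return k / (k - 1) * (1 - sum(pvariance(v) for v in items) / pvariance(totals))

def corrected_item_total(items, idx):
    # Correlation of item idx with the total score computed without that item.
    x = items[idx]
    y = [sum(vals) - vals[idx] for vals in zip(*items)]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

scores = [  # rows are items, columns are respondents (1 = correct, 0 = incorrect)
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
]
print(round(cronbach_alpha(scores), 2))           # overall alpha for the 3-item scale
print(round(corrected_item_total(scores, 0), 2))  # corrected item-total r for item 1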
An examination of the item-total correlations revealed that the Non-Semantic Word scale items had an
average correlation of .179, and therefore the entire scale
was omitted from further analyses. This reduced the
number of items within the three remaining test scales
as follows (the original number of items appears in
parentheses): Semantic Word 9 (11), Non-Semantic
Visual 10 (13), and Semantic Visual 3 (10). Note that
the greatest number of items were removed from the
semantic visual scale. Tables 3.3.7 to 3.3.10 present the
corrected item-total correlations for the items within
each scale. After omitting items based on the above
criteria, the number of items in this test was reduced
from 46 to 22.

Summary and Recommendations


The Analogy test assesses inductive reasoning and
information processing abilities in four areas: Non-Semantic Word, Semantic Word, Non-Semantic Visual, and Semantic Visual. The number-correct scores
that reflected reasoning ability proved less reliable than
the information processing scores. Of the 53 items in
the alpha battery, 41 contributed sufficiently to test
reliability to warrant inclusion in the revised version. It
was estimated that to achieve a reliability level of 0.80 it
would be necessary to increase the test length to 150
items. Given time limits in the validation version, the
overall test length was limited to 57 items.
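The jump from 41 items to an estimated 150 items for a target reliability of .80 is consistent with the standard Spearman-Brown prophecy formula, although the report does not state which projection method was used; the check below is therefore only a plausibility sketch, and the implied reliability of roughly .52 for the 41-item form is an inference, not a figure from the report.

def spearman_brown(rho, k):
    # Projected reliability when a test of reliability rho is lengthened by factor k.
    return k * rho / (1 + (k - 1) * rho)

k = 150 / 41                              # lengthening factor implied by the report
rho_implied = 0.80 / (0.2 * k + 0.8)      # solve spearman_brown(rho, k) = 0.80 for rho
print(round(rho_implied, 2))              # about 0.52
print(round(spearman_brown(rho_implied, k), 2))  # 0.8, consistency check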
Classification Test

Construct Validity
In assessing the construct validity of the information
processing measures independent of the number correct
scores, a multitrait-multimethod matrix was constructed.
Two traits (i.e., information processing and reasoning)
and four methods (i.e., Word Semantic, Word Non-Semantic, Visual Semantic, and Visual Non-Semantic)
were examined. The results of this analysis provided the
following median correlations:

Case Elimination Analyses


Cases in the Classification test were eliminated based
on three criteria: missing data, pattern responding, and
apparent lack of participant motivation.
Missing Data. As with the Analogy test, the Classification test software did not permit participants to skip
items. However, some participants (7.5%) did not
complete the test in the allotted time, resulting in
missing data. Of these cases, those missing 20% or more
of the data (i.e., cases missing data for nine items or
more) were omitted. A total of 10 cases were eliminated.

The median convergent validity (i.e., same trait, different


method) for information processing scores was .48.
The median convergent validity for number-correct
scores was .20.


The median divergent validity (i.e., different trait,


different method) was .09.


These preliminary results suggested a separation of


the information-processing and number-correct scores
for the Classification test, pending further investigation.
Time Limit Analyses
Based on the sample of 427 participants, 95% of the
participants completed the instructions and the 46 test
items in 22 minutes (Table 3.3.11). Table 3.3.12
shows estimates of test time limits assuming two different levels of reliability and test lengths for the three test
parts. These estimates assume keeping all aspects of the
test the same (i.e., all four classification schemes).
Summary and Recommendations
Of the original 46 items, only three of the four scales
(i.e., Semantic Word, Semantic Visual, and Non-Semantic Visual) and a total of 22 items contributed
sufficiently to test reliability to warrant inclusion in a
revised test version. To construct a test having the same
three parts and increase the reliability to about .80 (for
number-correct scores), the number of items would
need to increase from 22 to 139. It was further found
that the Classification test correlates highly with the
Analogy test. Given that the Classification test had
lower reliability scores than the Analogy test, it was
recommended that the Classification test be eliminated
from the AT-SAT battery.
Letter Factory Test
Analysis of Initial LFT
Case Elimination. Two criteria were used in eliminating Letter Factory Test participants: apparent lack of
participant motivation and inappropriate responding.
Unmotivated Participants. Unmotivated participants
were considered to be those who responded to very few
or none of the items. An examination of performance on
the number correct across all Planning/Thinking (P/T)
items in the test sequences (Table 3.3.13) reveals a gap
in the distribution at 28% correct. It was decided that
participants scoring below 28% were not making a
serious effort on this test, and they were eliminated from
further analysis.
Inappropriate Responding. Inappropriate responders were identified as participants who either selected a
box of the wrong color or selected boxes when none were
needed. During the test, there were 86 times when a participant should have placed a box in the loading area. The computer software recorded the number of times a participant tried to place a box incorrectly (i.e., to place a box when one was not needed or to place an incorrectly colored box). This measure serves as an indicator of inappropriate responding. Table 3.3.14 shows the distribution of the number of unnecessary attempts to place a box in the loading area across the entire sample of 441 cases.
Several participants had a very high number of these erroneous mouse clicks. There are two possible reasons for this. Feedback from the Test Administrators indicated that some participants were double-clicking the mouse button, instead of clicking once, in order to perform LFT test tasks. Every instance a participant erroneously clicks the mouse button is recorded and compiled by the computer to generate an inappropriate response score. Thus, if a participant orders the correct number of boxes (86) by double-clicking instead of single-clicking, 86 points will be added to his or her inappropriate response score.
A few participants had a random-response score higher than 86. These participants may have developed a strategy to increase their test score. The test instructions explained the importance of placing a box in the loading area as soon as one was needed. This may have caused some participants to continuously and consistently attempt to place boxes in the loading area. Participants received an error signal each time they unnecessarily attempted to place a box; however, they may not have realized the negative impact of this error on their test score.
Cases were eliminated where the inappropriate response score was higher than 86. This allowed using the data from participants who were motivated but might have misunderstood the proper way to use a mouse during this test. However, to prevent an inappropriate response strategy from interfering with a true measure of Planning/Thinking Ahead (P/T), information from the inappropriate response variable must be used when calculating a participant's P/T test score. Omitting cases based on the above criteria reduced the sample from 441 to 405.
Item Analyses
Recall From Interruption (RI). The proposed measure of RI was a difference score between a participant's number-correct score across a set of items presented immediately after an interruption and number-correct score across a set of items presented just before the interruption. Four of the test sequences (sequences 4, 6,


8, and 11) contained RI items. Table 3.3.15 shows the


number of items within each sequence that make up
each pre-interruption and post-interruption scale score,
as well as the score means and standard deviations. The
mean scores for each sequence are very high, indicating
a ceiling effect. However, increasing the difficulty of the
RI items would require increasing either the number of
letters on belts or the belt speed. Either of these methods
would alter the task so that psychomotor ability would
become a very significant component of the task. Therefore, it was concluded that this task is not suited to
providing a measure of RI.
Table 3.3.15 also provides a summary of the reliability of the pre-interruption, post-interruption, and difference scores. Notice that the reliability of the pre- and
post-interruption scores (Alpha = .79 and .73, respectively) is much higher than the reliability of the difference scores (Alpha = .10). Plans for the recall from
interruption score were abandoned due to low reliability.
Planning/Thinking Ahead. To prevent an inappropriate response strategy from interfering with a true
measure of P/T, the inappropriate response score must
be used in calculating a participant's P/T test score. The
test design prevented the association of unnecessary
mouse clicking with a specific P/T item. (Participants
do not have to respond to P/T test items in the order in
which they receive them; instead, they may wait and
then make several consecutive mouse clicks to place
multiple boxes in the loading area.) However, the
software recorded the number of times each participant
inappropriately attempted to place a box during a test
sequence. Also, the number of unnecessary mouse clicks
has a low but significant negative correlation (r = -.20,
p < .001) with the number-correct scores on a sequence.
Therefore, the P/T scale score per sequence was computed by subtracting the number of unnecessary mouse
clicks in a sequence from the number-correct score
across all P/T items in that sequence.
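A minimal sketch of this per-sequence scoring rule follows; the record field names are hypothetical, not the ones used by the test software.

def pt_sequence_score(num_correct, unnecessary_clicks):
    """P/T score for one sequence: P/T items correct minus the inappropriate
    box-placement attempts recorded for that sequence."""
    return num_correct - unnecessary_clicks

# Hypothetical per-sequence records for one participant.
sequences = [
    {"pt_correct": 9, "error_clicks": 2},
    {"pt_correct": 7, "error_clicks": 0},
]
scores = [pt_sequence_score(s["pt_correct"], s["error_clicks"]) for s in sequences]
print(scores, sum(scores))   # [7, 7] 14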
Table 3.3.16 summarizes findings from the reliability analysis on the seven sequences designed to measure
P/T. The second column indicates the number of P/T
items involved in calculating the P/T sequence scores.
Notice that the sequence-total correlations are .60 or
higher. Therefore, none of the P/T sequences were
deleted. The alpha computed on the seven sequences
was .86.
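The alpha reported throughout this chapter is Cronbach's coefficient alpha computed over part (here, sequence) scores. A generic sketch of that computation follows; the random matrix is illustrative and not study data.

import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a participants-by-parts score matrix."""
    k = scores.shape[1]
    part_var = scores.var(axis=0, ddof=1).sum()   # sum of per-part variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1.0 - part_var / total_var)

rng = np.random.default_rng(0)
ability = rng.normal(10, 3, size=(50, 1))
demo = ability + rng.normal(0, 2, size=(50, 7))   # 50 participants x 7 sequences
print(round(cronbach_alpha(demo), 2))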
Situational Awareness (SA). As noted earlier, the
Letter Factory Test contained multiple-choice questions designed to measure three levels of SA. Fourteen of
these items were designed to measure SA Level 1, 16

items to measure SA Level 2, and 14 items to measure SA


Level 3. Table 3.3.17 summarizes findings from the
analyses on the items within the scale for each of these
three levels. If an item revealed a corrected item-total
correlation (column 3) of less than .10, it was removed
from the scale. This reduced the number of Level 1,
Level 2, and Level 3 items to 8, 11, and 7, respectively.
A reliability analysis on the remaining SA items showed
alphas on the three new scales of .42, .61, and .47.
Next, a scale score was computed for each of the three
SA levels. These scores were used in a reliability analysis
to determine whether the three scales were independent
or could be combined into one scale that measured the
general construct of SA. The alpha computed on the
three scale scores was .53. The results indicated that
removal of any one of the three scale scores would not
increase the alpha. These results supported the notion
that all remaining SA items should be combined into
one scale.
Table 3.3.18 presents findings from a reliability
analysis on the 26 remaining SA items scored as one
scale. The alpha computed on this overall scale was .68.
The corrected item-total correlations computed in the
reliability analysis on the three separate SA scales (Table
3.3.17, After Item Deletion) were very similar to the
corresponding corrected item-total correlations computed for the combined scale (Table 3.3.18). This also
supports the notion of using a single number-correct
score across all remaining SA items.
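The item screen applied above (drop an item whose corrected item-total correlation falls below .10, then rescore) can be sketched as follows; the 0/1 response matrix is invented for illustration.

import numpy as np

def corrected_item_total(responses):
    """Correlation of each item with the total of the remaining items."""
    totals = responses.sum(axis=1)
    r = np.empty(responses.shape[1])
    for j in range(responses.shape[1]):
        rest = totals - responses[:, j]            # total excluding item j
        r[j] = np.corrcoef(responses[:, j], rest)[0, 1]
    return r

def retained_items(responses, min_r=0.10):
    """Indices of items whose corrected item-total correlation is at least min_r."""
    return np.where(corrected_item_total(responses) >= min_r)[0]

rng = np.random.default_rng(1)
theta = rng.normal(size=(200, 1))
items = (theta + rng.normal(size=(200, 14)) > 0).astype(float)   # 14 SA-style items
print(retained_items(items))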
Analysis of LFT Retest
The LFT Form B or retest contains five test sequences
that are parallel to the following sequences in the LFT
Form A: 3, 6, 7, 10, and 11. Form B contains 37 P/T
items, 54 RI items (27 pre-interruption items and 27
post-interruption items), and 20 SA items.
Case Elimination
First, we identified participants who had been classed
as unmotivated or random responders while taking
Form A of the LFT. Twenty-three of those participants
also received Form B of the LFT. Since Form B was
designed to serve as a retest, we eliminated the Form B
23 cases that had been eliminated from Form A.
Next, the same criteria used in Form A were used to
eliminate unmotivated participants and inappropriate
responders in Form B. To look for unmotivated participants, we considered performance on the number correct across all P/T items in the test sequences. Table
3.3.19 provides an overview of the distribution of


those number-correct scores across the entire sample
of 217 cases.
A natural gap was evident in the distribution and
cases where the number-correct score was lower than 30
were eliminated. Then, participants who were responding inappropriately to items were identified. During the
test, there were 37 times when a participant should have
placed a box in the loading area. Table 3.3.20 provides
an overview of the distribution of the number of inappropriate attempts to place a box in the loading area
across the entire sample of 217 cases. Cases where the
inappropriate response score was higher than 37 were
eliminated. After omitting cases based on the above
criteria, the sample size for this test was reduced from
217 to 184.
Item Elimination
Again, since Form B was designed to serve as a retest, the findings from analyses performed on LFT Form A were used to determine which test items to eliminate from Form B. We removed 8 SA items, so 12 SA items remain in Form B. Similarly, a P/T score was computed for Form B by subtracting the number of unnecessary mouse clicks from the number-correct score across all P/T items.

Performance Differences
Form B was used to assess whether participants had reached an asymptote of performance during Form A. Different sequences could not be used in Form A for this test because the item types are very heterogeneous, and little information is available on item or sequence difficulties. By matching test sequences, we can control for manageable aspects of the test that impact test performance. Table 3.3.21 presents the results of dependent t-tests comparing two performance measures. Those results show no support for a change in participants' performance on Situational Awareness. However, the roughly 8% performance increment on Planning and Thinking Ahead was a significant increase in performance. This suggests that participants would benefit from more practice before beginning the test.

Time Limit Analyses
Table 3.3.22 presents the distribution of test completion times for the LFT, showing that 95% of participants completed the LFT in 64.9 minutes or less. When we use the normal curve to compute the 95th percentile (i.e., take 1.96 times the standard deviation (7.63) and add that product to the mean), we estimate a slightly higher amount of time (66.7 minutes) for the 95th percentile participant. A test completion time of 67 minutes, then, seems appropriate for the test at its current length. Of this, 95% of participants completed the LFT test sequences in 27.1 minutes. This leaves about 39.9 minutes for instructions and practice.
The Spearman-Brown formula was used to estimate the number of items needed to raise the reliability of Situational Awareness to .80 (49 items) and .90 (110 items). Because the measure of Planning and Thinking Ahead already has a higher estimated reliability of .86, it would automatically go up to a sufficient level when the number of sequences is raised to increase the reliability of Situational Awareness. Table 3.3.23 presents a recommended composition of sequence lengths and number of Situational Awareness test items per sequence. It was estimated that this recommended composition would yield a test reliability in the low .90s for Planning and Thinking Ahead, and a reliability in the low .80s for Situational Awareness. Based on experience with the alpha LFT, it was estimated that participants would spend 45 minutes on the test portion of the new test version.
The amount of practice before the test also needed to be increased. The initial practice sequence was an easy 30-second sequence, followed by sequences of 2 minutes and 2.25 minutes. Adding three more sequences of the same difficulty as the test, together with four SA questions after each sequence, was proposed. One 30-second sequence and two 2.5-minute sequences would add an additional 7.5 minutes to the practice and instruction time. Improving the instructions to emphasize the penalty that occurs for error clicks on the Planning/Thinking Ahead measure would add about half a minute. Instruction and practice time, then, should increase by 8 minutes from 39.9 to about 48 minutes. With a 45-minute time for the test sequences, this amounted to 93 minutes for the recommended beta version of the LFT.

Test Revisions
Test Sequences. To reduce the number of inappropriate responses by participants who double-click or continuously click on the box stack, the associated error signal (one red arrow) was changed to an error message. The error message appears in red above the box stack when participants try to move a box to the loading area when one is not needed. The new error message reads, "You did not need to move a box."
To increase visibility, the error signal was also changed to an error message when participants try to place a letter in a box that is not the fullest. This error message reads, "You did not place the letter in the fullest box."
Analyses described above showed that the mean
scores for RI items had a ceiling effect. Also, it was
indicated earlier that the difficulty level of the RI items
could not be increased without making psychomotor
ability a significant component of the task. For these
reasons, all RI items were removed from the test.
To increase test reliability for the P/T and SA measures, the number of test sequences and SA questions
was increased. Test sequences were increased from 11 to
18. Level 1 SA questions were increased from 14 to 26,
Level 2 SA questions from 16 to 24, and Level 3 SA
questions were increased from 14 to 26. SA questions
also were revised.
Test Instructions. Based on feedback from the project
team and the FAA, several sections of the instructions
were changed. The pilot version of the test included
several short practice sequences in the middle of the
instructions and three complete practice sequences at
the end of the instructions but before the test. To
increase the amount of practice, two practice sequences
were added in the middle of the instructions, allowing
participants to practice moving boxes to the loading area
and placing letters into boxes. In addition, the mouse
exercise appears before the practice sequences so participants can learn how to use the mouse before they
practice taking the test. In addition, the mouse exercise
was changed so participants can choose to receive additional practice using the mouse (up to nine trials), or
move to the next set of instructions.
Other changes in response to project team and FAA
feedback include (a) not showing the mouse cursor
(arrow) on the screens in which the mouse is disabled;
(b) adding a screen that identifies the six items/areas that
make up the test display; (c) simplifying and increasing
the speed of some of the examples; (d) changing the
"Call Quality Control" screen label to "Quality Control"; and (e) simplifying some words and sentences in
portions of the instructional text.
Changes were also made in parts of the instructions
in response to the data analyses. To reduce the number
of inappropriate responses due to double-clicking or
constant attempts to place boxes in the loading area,
instructions were added telling participants to click on
the mouse only once and to not double-click. Corrective
feedback also was added to the practice sequences that
appear in the middle of the instructions. For these

practice sequences, if a participant clicks more than once to move a box to the loading area, the screen freezes and the following corrective feedback appears: "To move a box to the loading area, you only need to click on the box once. Do not double-click." The following corrective feedback appears when a participant does not place a box in the loading area when one is needed: "The computer moved a box into the loading area for you. Keep track of new letters that appear on the conveyor belt and move boxes to the loading area as soon as they are needed." If the computer places a letter into a box for the participant, the following corrective feedback appears: "A letter fell off a conveyor belt and the computer placed it in a box. Make sure you track the letters as they move down the belts." The last corrective feedback appears when a participant tries to place a letter in a box that is not closest to being full: "You need to place the letters in the boxes that are closest to being full."
Finally, since the RI measure had been eliminated
from the test, any reference to RI was removed from the
instructions. Also, in view of the significant increase in
participant performance on the P/T measure on the
retest, three practice sequences were added at the end of
the instructions. This provides participants with a total
of six practice sequences at the end of the instructions.
Situational Awareness Items. SA items at the end of
each LFT sequence were revised on the basis of the item
analyses described above. The following subsections
below provide an overview of how item analyses guided
development of new SA items.
For all levels of SA, care was taken to not ask any
questions about letters below availability lines on the
belts. The reason was that examinees who worked
quickly would have completed tasks associated with
letters below availability lines; for such examinees, those
letters might not even be on the belts, but rather in
boxes. For all examinees, though, the letters above the
belts would all be in the same places.
Level 1 Items. Level 1 Situational Awareness items
assess an examinee's ability to perceive elements in the
environment, together with their status, attributes, and
dynamics. The following are examples of item stems
(i.e., the questions) that item analysis supported:
Which belt moved letters the SLOWEST?
Which belt had the most letters ABOVE the availability line?
Which two belts have their availability lines closest to
the BOTTOM of the belt?
How many ORANGE letters were ABOVE the availability lines?


Which belt had no letters on it?

Consider the LAST box you should have placed in the


loading area. Which letter caused you to need to place
this last box?
How many (or which) boxes should be in the loading
area in order to correctly place all the letters?
Consider all the letters on the belts. What color were
the letters that, when combined, could fill at least one
box?

Listed below are examples of item stems that item


analysis did NOT support:
Which belt moved letters the FASTEST?
Which belt had its availability line CLOSEST to the
TOP of the belt?
Which letter was CLOSEST to the TOP of the belt?
Which letter was CLOSEST to crossing the availability line?

Listed below are examples of item stems that item


analysis did NOT support:
If all the letters were correctly placed in boxes, how
many empty spaces for Ds would be in the boxes in the
loading area?
If all letters were correctly placed in boxes, how many
more (or which) letters would be needed to completely
fill the GREEN box?

Item analysis, therefore, suggested that Level 1 SA is


more reliably measured in the LFT by asking about the
number and color of letters above the availability lines,
which belt was slowest, and which belts had their
availability lines closest to the bottom. These questions
are consistent with targeting items toward a lower
average level of planning and thinking ahead. Generally,
the items that did not contribute well to scale reliability
included those about the fastest belt, availability lines
closer to the top of the belt, and letters closer to the top
of the belt. Those items are more consistent with a
higher level of planning and thinking ahead. Therefore,
development of new Level 1 SA items was focused on the
types of areas listed above that required lower amounts
of planning and thinking ahead.
Level 2 Items. Level 2 Situational Awareness items
assess an examinee's ability to comprehend the situation. This requires a synthesis of disjointed Level 1
elements. In asking about Level 2 SA, it is important to
assess comprehension of Level 1 elements while requiring no projection of the future status (a Level 3 factor).
In the LFT setting, it was also considered necessary to
clearly distinguish between letters already in boxes in the
loading areas, compared with letters that examinees
could place there. To make this distinction clear, short
(i.e., 30-second) scenarios where no letters ever crossed
any availability lines were developed. That way, no
letters were in boxes at the end of these short scenarios.
In these scenarios, examinees could only place boxes and
maintain their situational awareness.
The following are examples of Level 2 item stems
(i.e., the questions) that item analysis supported:

Item analysis, therefore, suggested that Level 2 SA is


more reliably measured in the LFT by asking about the
last box an examinee should have placed in the loading
area, the number or color of boxes that should be in the
loading area, and what color of letters could completely
fill a box. These questions are consistent with targeting
items toward the more immediate LFT concerns. Generally, the items that did not contribute well to Level 2
scale reliability included those about the number of
empty spaces for a particular letter, and how many more
or which letters were required to fill a particular color of
box. Those items are more consistent with a more finegrained level of situational awareness. Therefore, development of new Level 2 SA items was focused on the
types of areas listed above that required only an awareness of the more immediate LFT concerns.
Level 3 Items. Level 3 SA items assess an examinee's ability to project the future status or actions of the elements in the environment. This requires a knowledge of the status and dynamics of the elements, as well as a comprehension of the situation (both Level 1 and Level 2 SA). In asking about Level 3 SA for LFT
sequences, it is important to ensure that all examinees
are responding to the same situation. Although the
software stops at exactly the same place for all examinees, the quicker examinees may have placed more boxes
in the loading area or placed more letters in boxes. For
this reason, all Level 3 SA items were started with the
following two sentences separated as a paragraph before
the Level 3 question:

What color was the LAST box you should have placed
in the loading area in order to correctly place all the
letters into boxes?
Assume that you correctly placed all required boxes in the


loading area. Also assume that you correctly placed all the
letters that remained on the belts.

correction to the software was implemented on February 26. Of the 429 cases on which data were collected,
151 cases had complete data.

The intent of this introduction was to allow all


examinees to (at least mentally) begin from the same
situation. Item analysis showed that Level 3 SA is more
reliably measured in the LFT by asking about simple
situations. In the case of the LFT, that seemed to be a
situation having only two or three boxes in the loading
area. The following are examples of Level 3 item stems
(i.e., the questions) that item analysis supported for
those simple situations:

Case Elimination
Because all participants proceed at the same rate
during practice and test sequences, test completion time
could not be used to assess participants' test-taking
motivation. Likewise, because the test software automatically writes out data for each item indicating whether
the participant correctly selected the item, no cases
should have missing data.
Unmotivated Participants. It was believed that unmotivated participants would respond to very few or
none of the items, or respond with irrelevant answers.
The number-correct scores were used to identify unmotivated participants. The distribution of the 429 participants is provided in Table 3.3.24. An examination of
the data showed that no participant simply sat at the
computer and allowed the software to progress on its
own. Each participant entered some appropriate responses, and each got at least a few items correct. The
lowest score shown was 22 out of the 162 questions
correct (13.6%). While there may have been participants who were not trying their best, this screening
algorithm was unable to identify participants who blatantly did nothing at all. Therefore, all cases were kept
for the analyses.

After the full boxes are removed, which boxes would


remain in the loading area?
Which letters would you need to complete an ORANGE box?
How many more letters would you need to fill all the
boxes remaining in the loading area?
If the next letter was a PURPLE A, which of the
following would be true?
Therefore, development of new Level 3 SA items was
focused on simple situations and used the types of
questions listed above.
Summary and Recommendations
The original plan was to measure three worker requirements using the LFT. Because the measure of
Recall From Interruption showed ceiling effects and
unreliable difference scores, it was recommended that
attempts to measure that worker requirement with this
test be abandoned. To more adequately measure the
worker requirements of Planning and Thinking Ahead
and Situational Awareness, lengthening the test to 93
minutes was recommended. This longer version includes doubling the number of practice sequences that
participants complete before they begin the test. It was
estimated that this extra practice would reduce the
practice effect observed between the LFT and the retest
LFT on a small (N = 184) subsample. This would help
ensure that participants perform at or near their ability
prior to beginning the test portion of the LFT.

Item Analyses
Table 3.3.25 presents findings from the reliability
analysis on the four test sequences (i.e., T1 to T4). The
three parts of the table show how the sequence reliabilities
measured by alpha differed as different groups of items
were deleted. The first part (With Change Items)
presents results that include all the items in each sequence. Each change item may be considered as two
items; the first is the item as presented originally, and the
second is the item with the change in the bottom or
three-digit number. The middle columns include the
pre-change items and exclude the post-change items,
and the third part of the table removes both versions of
the change items (i.e., the original and the change part).
Notice, too, that the second and third parts of the table
show Actual and Expected alphas. The actual alphas are the results provided by the data. The expected
alphas are the ones estimated by the Spearman-Brown
formula if like items were deleted. In every case, the
alphas from the data are higher than the expected alphas.
This finding supports the notion that the change items

Scan Test
Data Collection/Software Problems
As the data collection proceeded on the Scan test, it
became clear that the software was not writing data for
change items nor was it recording item latencies. A
differ from the other items in the test. Not including


them in the scoring, therefore, should increase the alpha
test reliability.
Of the 166 remaining items in sequences T1 to T4,
only four items (i.e., items 359, 373, 376, and 410) had
item-total correlations less than .10. The alpha computed on the 162 items remaining was .959. This
supported computing a number correct score using
these 162 items for the scanning worker requirement.

Planes Test
Case Elimination
The Planes test consisted of three parts and cases were
eliminated from each part independently. The screening algorithms for each part were based on similar
premises.
Part 1 consisted of 48 items. Participants were eliminated from further analyses if any of three screening
criteria were satisfied. The first screen for this part was
a total latency less than 48 seconds. The second screen
was percent correct less than or equal to 40%. The final
screen was the skipping of six or more items. These
screening algorithms reduced the sample from 450 to
429 for Part 1.
The screening for Part 2 was similar. Participants
were eliminated from further analyses on these criteria:
(1) Part 2 total latency less than 1.2 minutes, (2) 40% correct or less, or (3) missing data for six or more items.
These screening algorithms reduced the available sample
from 450 to 398 for Part 2.
For Part 3, participants were eliminated on these criteria: (1) Part 3 total latency less than 2.4 minutes, (2) 40% correct or less, or (3) missing data for 12 or more items.
These screening algorithms reduced the available sample
from 450 to 366 for Part 3.
Participant elimination across all three test parts left
a final sample of 343 having data on all three parts.
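A hedged sketch of this three-part screening, written as a hypothetical helper; the latency, percent-correct, and missing-item thresholds are the ones stated above.

def keep_planes_case(part, total_latency_minutes, pct_correct, missing_items):
    """True if one part of a participant's Planes data survives screening."""
    min_latency = {1: 48.0 / 60.0, 2: 1.2, 3: 2.4}[part]   # minutes
    max_missing = {1: 6, 2: 6, 3: 12}[part]
    if total_latency_minutes < min_latency:
        return False            # responded too quickly to be meaningful
    if pct_correct <= 40.0:
        return False            # at or below the 40% correct cutoff
    if missing_items >= max_missing:
        return False            # skipped or missing too many items
    return True

# A case enters the final sample only if all three parts survive.
print(all(keep_planes_case(p, 3.0, 72.5, 0) for p in (1, 2, 3)))   # True
print(keep_planes_case(3, 2.0, 72.5, 0))                           # False (latency)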

Time Limit Analyses


Table 3.3.26 shows the distribution of test times for
participants in this sample; 95% completed the Scan
Test in 19.87 minutes or less. If we take 1.96 times the
standard deviation of test completion times (1.75) and
add that product to the mean test completion time
(16.92), we find that a 95th percentile participant might
take 20.35 minutes to complete the Scan Test. Due to
the obtained test reliability, it was recommended that
no change be made to the test time for the Scan test, with
21 minutes allocated for this test.
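The projected completion time used here (and for the other tests) is the normal-approximation estimate, mean plus 1.96 standard deviations, checked against the empirical 95th percentile. A small sketch with the Scan test figures plugged in:

def projected_p95(mean_minutes, sd_minutes):
    """Normal-curve estimate of the 95th percentile completion time."""
    return mean_minutes + 1.96 * sd_minutes

print(round(projected_p95(16.92, 1.75), 2))   # 20.35 minutes, as reported above
# With raw completion times in hand, the empirical value would simply be
# numpy.percentile(completion_times, 95).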
Summary and Recommendations
Items in the Scan test that change during their screen
presentation did not behave the same as other items in
the test. Eliminating those items improved estimates of
internal consistency reliability. After eliminating four
items that had poor item-total correlations, the 162
remaining items in the test (i.e., non-practice) portion
of the Scan test produced an alpha of .96. Therefore, we
recommend keeping the Scan test at its current length
and allocating 21 minutes for test completion.
Participant performance on the Scan test items is
affected by the configuration of other items presented
on the same screen, so any change must be considered
carefully. As a class of items, the change items tended to
reduce the Scan test reliability. By eliminating the
changing nature of the change items, the test instructions could be simplified. However, eliminating those
items might make the test easier or change the test in
some other way. Therefore, it was recommended to keep
the items as they are presented initially (i.e., without the
changing feature) but not count them. A similar recommendation was made for the four items that showed
poor item-total correlations.

Item Analyses
Scale Reliabilities and Item Analyses. Reliability
analyses were conducted to identify items within each
part of the Planes test that contribute to internal consistency. The corrected item-total correlation was computed for each item within each part as was the overall
alpha for that part. Table 3.3.27 presents an overview of
the results of these reliability analyses.
The Planes test is not a new test, having been developed previously as the Ships test (Schemmer et al.,
1996). In its alpha test form, the number of items was
cut in half to meet the time allowed for it in the pretest.
In reducing the number of items, the same proportion
was kept for all item types. However, there are many
parallels between the items in each of the three parts of
the test; a particular item that may not work well in Part
1 might work very well in Parts 2 or 3. For these reasons and
because data from all three parts were to be used to develop
a residual score for the coordinating ability component of
multitasking, eliminating items based on poor item-total
correlations alone was not considered desirable.
Restoring the Planes test to its original length would


require doubling the number of items. Using the Spearman-Brown formula, the new reliabilities are estimated at .86 for
Part 1, .91 for Part 2, and .89 for Part 3.
Computing Residual Scores. Using number correct
scores from Planes Part 1 and Part 2, the regression
procedure outlined in Yee, Hunt, and Pellegrino (1991)
was followed to compute an estimate of the coordinating ability component of multitasking. First, the regression equation for predicting the Part 3 score was
computed. Then, the difference between the actual and
predicted scores for Part 3 was computed by subtracting
the predicted from the actual score. This residual score
estimates the coordinating ability aspect of multitasking.
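A minimal sketch of this residual-score procedure: regress the Part 3 score on Parts 1 and 2 by ordinary least squares and take actual minus predicted. The score arrays are placeholders, not study data.

import numpy as np

def coordinating_residual(part1, part2, part3):
    """Residual of Part 3 after regressing it on Parts 1 and 2."""
    X = np.column_stack([np.ones_like(part1), part1, part2])
    beta, *_ = np.linalg.lstsq(X, part3, rcond=None)   # OLS coefficients
    return part3 - X @ beta                            # actual minus predicted

p1 = np.array([30.0, 35.0, 28.0, 40.0, 33.0, 37.0])   # placeholder scores
p2 = np.array([25.0, 31.0, 22.0, 38.0, 29.0, 30.0])
p3 = np.array([60.0, 70.0, 55.0, 85.0, 66.0, 72.0])
print(np.round(coordinating_residual(p1, p2, p3), 2))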
Yee et al. argue that a necessary but not sufficient
condition for the residual scores to be useful is that they
must be reliable. As they indicate, the quantity (1-R2)
must be greater than zero after allowing for unreliability
in the performance measures involved. To show the
residual score as reliable, analysts corrected the test
scores for each of the three parts of the Planes test for
unreliability and created a new correlation matrix. Using this corrected correlation matrix, the multiple correlation was computed to predict Part 3 from Parts 1 and
2 (R = .506, R2 = .256). To the extent that this multiple
R2 is less than unity after correcting all performance
measures for unreliability, the residual scores may be
considered reliable.
In addition, analysts followed the procedure of Yee et
al. and compared the multiple R2 (computed on observed scores, R2 = .164) to the reliability of Part 3 (alpha
= .804). Both analyses supported the inference of residual score reliability. Finally, we used the reliabilities
of the predicted and actual scores to estimate the reliability of the residual score (r = .613). The reliability of
the coordinating ability score for a Planes test of twice
the length was estimated to be .65.
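The reliability argument above rests on disattenuating the between-part correlations before recomputing the multiple R2. A generic sketch of that correction, with invented observed values:

import math

def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures."""
    return r_xy / math.sqrt(rel_x * rel_y)

print(round(disattenuate(0.35, 0.75, 0.80), 3))   # invented values, not the study's
# The residual carries usable variance only while the corrected multiple
# R-squared of Part 3 on Parts 1 and 2 stays clearly below 1.0.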

the 192 items in the three test parts. Doubling the


number of items, then, would increase the test time by
15.4 minutes, from 37 to 52 minutes.
Test Revisions
Following the alpha testing of the Air Traffic Controller Test, the Planes test was revised in several ways,
including test and practice length, test instructions,
response mode, and content.
Test Length. Part 3 was reduced to 48 from 96
questions, the one-minute breaks were cut to 30 seconds, and practice sessions were reduced from 24 to 12
questions.
Mode of Response. The mode of response was
changed for all three subtests. Parts 1 and 2 were changed to keys labeled "R" for the red plane and "W" for the white plane instead of the numeric keypad "1" key to represent the red plane and the "3" key on the numeric keypad to represent the white plane. Part 3 changed to keys labeled "T" for true and "F" for false, instead of using "1" and "3" of the numeric keypad to represent false and true, respectively.
Test Content. The content of Part 3 was changed so
that all questions used double-negative statements
(e.g., "It is not true that the white plane will not arrive after the red plane."), thereby making Part 3 distinct
from Part 2. Previously, some questions in Part 3 were
like those in Part 2.
Instructions. The test instructions were simplified in
several places. Also, the instructions in the Answering
Test Items section were revised to correspond to the
changes made in mode of response (noted above).
Summary and Recommendations
The project team cut the number of items in each part
of the original Planes test in half for the alpha data
collection effort. This was done to meet project time
constraints. After completing reliability analyses, it was
clear that the test would benefit from restoring it to its
original length. Available test time in the beta version
was limited, however. As a result, the number of items
in Part 3 and in the practice sessions was cut in half. The
time allotted for breaks between the three test parts was
also halved.

Time Limit Analyses


Table 3.3.28 shows the distribution of test completion times for the Planes test. Ninety-five per cent of
participants completed the Planes test in 34.6 minutes
or less. When we take 1.96 times the standard deviation
(4.42) and add that product to the mean, we estimate a
slightly higher amount of time (36.4 minutes). A test
completion time of 37 minutes, then, seems appropriate
for the test at its current length. Of this 37 minutes, 95%
of participants completed instructions and practice in
21.6 minutes. This leaves 15.4 minutes for completing

Experiences Questionnaire
The following Experiences Questionnaire analyses
were performed on data from the first 9 of the 12 days
of pilot testing at Pensacola in February, 1997. The total
N in this data set is 330. The last 2 days of pilot testing included a large number of the ATCS students; performance of the ATCS students and performance of non-ATCS students on the EQ have not been compared.

As can be seen, a large number of students gave one


or more random responses (108, or 32.8%). Whether
this indicates that the new random response items are
too difficult, or that a large number of students were not
attending very closely to the EQ (or other tests?) is
unclear. Students with two or more random responses
were removed from the data set, resulting in a screened
sample of 274 EQs available for further analysis.

EQ Format
The pilot test version of the EQ contained 201 items
representing 17 scales, including a Random Response
Scale. All items used the same set of five response
options: Definitely True, Somewhat True, Neither
True Nor False, Somewhat False, and Definitely False.

Time to Complete EQ
The mean amount of time required to complete the
EQ for the screened data set was 29.75 minutes (SD =
9.53, Range = 10-109). A few individuals finished in
approximately 10 minutes, which translates into roughly
3 seconds per response. The records of the fastest
finishers were checked for unusual response patterns
such as repeating response patterns or patterns of all the
same response (which would yield a high random response score anyway), and none were found. Thus, no
one was deleted from the data set due solely to time
taken to complete the test. It is not surprising to note
that the fastest finishers in the entire, unscreened sample
of 330 were deleted based on their scores on the random
response scale.

Data Screening
Three primary data quality screens are typically performed on questionnaires like the EQ: (a) a missing data
screen, (b) an unlikely virtues screen, and (c) a random
response screen. The missing data rule used was that if
more than 10% of the items on a particular scale were
missing (blank), that scale score was not computed. No
missing data rule was invoked for across-scale missing
data, so there could be a data file with, for example, all
scale scores missing. No one was excluded based on
responses to the unlikely virtues items, that is, those
items with only one likely response (Example: "You have never hurt someone else's feelings," where the only likely response is "Definitely False").
A new type of random response item was tried out in
the pilot test, replacing the more traditional, right/
wrong-answer type, such as "Running requires more energy than sitting still." There were four random response items, using the following format: "This item is a computer check to verify keyboard entries. Please select the Somewhat True response and go on to the next item." The response that individuals were instructed to
select varied across the four items. A frequency distribution of the number of random responses (responses
other than the correct one) follows:
Number of Random Responses      N      Percent
0                             222       67.3
1                              52       15.8
2                              34       10.3
3                              18        5.5
4                               4        1.2
Total                         330      100.0
Scale Scoring
EQ items were keyed 1 - 5, the appropriate items were
reversed (5 - 1), and the scale scores were computed as
(the mean item response) x 20, yielding scores ranging
from 20 to 100. The higher the score, the higher the
standing on the characteristic.
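A minimal sketch of this scoring rule: reverse-keyed items are flipped (6 minus the 1-5 response) and the scale score is the mean keyed response times 20.

def eq_scale_score(responses, reverse_keyed):
    """Scale score in the 20-100 range from 1-5 item responses."""
    keyed = [6 - r if rev else r for r, rev in zip(responses, reverse_keyed)]
    return 20.0 * sum(keyed) / len(keyed)

# Illustrative 5-item scale with the last two items reverse keyed.
print(eq_scale_score([4, 5, 3, 2, 1], [False, False, False, True, True]))   # 84.0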
Descriptive Statistics and Reliability Estimates
Appendix B contains the descriptive statistics and
internal consistency reliabilities for 16 scales (Random
Response Scale excluded). The scale means were comfortingly low and the standard deviations were comfortingly high, relieving concerns about too little variance
and/or a ceiling effect. The Unlikely Virtues scale had
the lowest mean of all (51.85), as it should.
The scale reliabilities were within an acceptable range
for scales of this length and type. Most were in the .70s
and .80s. The two exceptions were Self Awareness (.55)
and Self-Monitoring/Evaluating (.54).
Four items had very low item-scale correlations, so
they were removed from their respective scales: Items 21
and 53 from the Decisiveness scale (item-scale correlations of -.02 and -.05 respectively), item 144 from the
Self-Monitoring/Evaluating scale (correlation of .04),
and item 163 from the Interpersonal Tolerance scale
(correlation of -.19). Each of these four items was


correlated with all of the other scales to see if they might
be better suited to another scale. Item 144 correlates .23
with the Interpersonal Tolerance scale, and its content
is consistent with that scale, so it has been moved. The
remaining three items either did not correlate high
enough with other scales, or the item content was not
sufficiently related to the other scales to warrant moving
them. These three items were deleted, and the descriptive statistics and internal consistency reliabilities were
rerun for the three scales affected by the item deletions/
moves. Appendix B contains the revised descriptive
statistics and internal consistency reliabilities for the
three scales affected.
At the item level, the means and standard deviations
were satisfactory. (Item means and SDs can be found in
the reliability output in Appendix B.) The only items
with extreme values and/or low standard deviations were
on the Unlikely Virtues scale, which is as it should be.

Factor 1: Concentration, Tolerance for High Intensity,


Composure, Decisiveness, Sustained
Attention, and Flexibility.
Factor 2: Consistency of Work Behaviors, Interpersonal
Tolerance, and Self-Awareness.
Factor 3: Self-Monitoring/Evaluating and Working
Cooperatively.
Factor 4: Taking Charge, Self-Confidence, Task
Closure/Thoroughness, and Execution.
In the 4-factor solution, the first factor of the 2-factor
solution is split into two parts. One part (Factor 1)
contains scales related to maintaining attentional focus
and the ability to remain composed and flexible. The
other part (Factor 4) contains scales related to taking
charge of situations and following through. The second
factor in the 2-factor solution also split into two parts in
the 4-factor solution, although not quite so tidily.
Actually, Working Cooperatively correlates just about
equally with Factors 2 and 3 of the 4-factor solution. If
EQ predictor composites were to be created at this
point, the tendency would be toward three composites,
drawn from the 4-factor solution: Factor 1, Factor 4,
and the combination of Factors 2 and 3.

Scale Intercorrelations and Factor Analysis


Appendix B also contains EQ scale intercorrelations
and factor analysis output. Principal axis factor analysis
was used, and the 2-, 3-, and 4-factor solutions were
examined, with solutions rotated to an oblimin criterion. As can be seen in Appendix B, there is a large
positive manifold. Consequently, there is a large general
factor, and it is most likely that any other factors that
emerge will be at least moderately correlated.
In the 2-factor solution, the two factors correlate .75.
Factor 1 consists of Decisiveness, Concentration, SelfConfidence, Task Closure/Thoroughness, Taking
Charge, Execution, Composure, Tolerance for High
Intensity, Sustained Attention, and Flexibility. Factor 2
consists of Interpersonal Tolerance, Working Cooperatively, Consistency of Work Behaviors, Self-Awareness,
and Self-Monitoring/Evaluating. Although the high
correlation between these two factors indicates that a
general factor accounts for much of the variance in these
two factors, there is some unique variance. Factor 1
appears to reflect a cool, confident, decisive character;
Factor 2 appears to reflect a character that is self-aware
and works well with others.
In the 3-factor solution, the third factor does not
appear to add any useful information. The 4-factor
solution appears to be interpretable. In this solution, the
factors are comprised of the following scales:

Summary and Recommendations


The EQ results in the pilot test were promising. Most
of the scales looked good in terms of their means,
variances, and reliabilities. The two scales that were
weakest, psychometrically, were Self-Awareness and
Self-Monitoring/Evaluating.
Item analysis suggested that items 21, 53, and 163
should be deleted, and item 144 moved to a different
scale. If the EQ must be shortened, deletion of scales
rather than individual items seemed preferable, given
the high correlations between scales. However, even the
scales most highly correlated (e.g., Decisiveness and
Sustained Attention, r = .80, and Decisiveness and
Composure, r = .81) appear to be measuring somewhat
different constructs. Based on considerations including
the desired length of the beta version of the AT-SAT test
battery, a final decision was made to decrease the EQ to
175 items. The Self-Monitoring scale was deleted in its
entirety, and several scales were shortened slightly.
The issue of how to use the Unlikely Virtues scale
remained unresolved. Although the mean and standard
deviation for this scale appeared just as they should in
the pilot test, this sample did not provide any information about how much "faking good" would actually
occur in an applicant population.

For all measures except PCTDEST, the next step was


to define a new scaling of each of these variables so that
higher scores indicated better performance and so that
the scale would be most sensitive to differences at higher
levels of performance. In the initial scaling, the difference between 0 and 1 error was treated the same as the
difference between 50 and 51 errors, even though the
former is a much more important distinction. The
transformations used were of the form:

Air Traffic Scenarios


The Air Traffic (AT) Scenarios test consisted of two
brief practice scenarios of 4 to 5 minutes each, and four
test scenarios of 15 to 20 minutes each. One-fourth of
the examinees that completed the AT test were also
given a seventh (retest) test scenario at the end of the day.
Two types of scores were recorded for each trial. First,
there were counts of different types of errors, including
crashes and separation errors (plane-to-plane and plane-to-boundary) and procedural errors (wrong destination,
landing/exit speed, or landing/exit altitude). Second,
there were efficiency measures expressed in terms of
percentage of aircraft reaching their target destination
and delays in getting the aircraft to their destination and
in accepting handoffs.

New Scale = 1 / (a + b*Old Scale)

where a and b were chosen so that optimal performance would be around 100 and performance at the
average of the old scale would map onto 50. For the AT
Test, optimal performance was indicated by 0 on each
of the original measures so that the transformation
could be rewritten as:
New Scale = 100 / (1 + Old Scale/Old Mean).
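A small sketch of this rescaling; as noted in the scoring discussion, the mean is taken within each trial, so every trial maps its optimum to about 100 and its average to about 50. The error counts below are invented.

import numpy as np

def rescale_errors(old_scores):
    """New Scale = 100 / (1 + Old / Old Mean), with the mean taken per trial."""
    return 100.0 / (1.0 + old_scores / old_scores.mean())

trial_errors = np.array([0.0, 2.0, 5.0, 5.0, 12.0])    # invented error counts
print(np.round(rescale_errors(trial_errors), 1))       # 0 errors maps to 100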

Scoring
Initial inspection of the results suggested that crashes
and separation errors (safety) were relatively distinct
from (uncorrelated with) procedural errors. Consequently, four separate scores were generated to account
for the data. Initial scores were:
CRASHSEP = crashes + separation errors
PROCERR = total number of procedural errors of all kinds
PCTDEST = percent reaching target destination
TOTDELAY = total delay (handoff and enroute)

It was also decided to scale each trial separately. The


last two trials were considerably more difficult than the
preceding ones, so variance in performance was much
higher for these trials. If the data from each trial were not
rescaled separately, the last trials would receive most of
the effective weight when averages were computed.
Consequently, the means referred to in the above formula were trial-specific means. The new scale variables
for each trial had roughly equivalent means and variances
which facilitated cross-trial comparisons and averaging.

In computing safety errors, crashes were initially


given a weight of 4.0 to equalize the variance of crashes
and separation errors. Since crashes are relatively rare
events, overweighting crashes led to reduced consistency across trials (reliability). Alternative weightings
might be explored at a later date, but would be expected
to make little difference. Consequently, it was decided
to simply count crashes as an additional separation
error.
One other note about the initial computation of
scores is that airport flyovers were initially listed with
separation errors but appeared to behave more like
procedural errors. Examinees are not given the same
type of warning signal when an aircraft approaches an
airport as when it approaches another plane or a boundary, so avoiding airport flyovers was more a matter of
knowing and following the rules.

Case Elimination
During the initial analyses, prior to rescaling, there
were several cases with very high error rates or long delay
times that appeared to be outliers. The concern was that
these individuals did not understand the instructions
and so were not responding appropriately. (In one case,
it was suspected that the examinee was crashing planes
on purpose.) The rescaling, however, shrunk the high
end (high errors or long times) of the original scales
relative to the lower end, and after rescaling these cases
were not clearly identifiable as outliers. Inspection of the
data revealed that all of the cases of exceptionally poor
performance occurred on the last test trial. The fact that
the last trial was exceptionally difficult and that similar
problems were not noted on the earlier trials, suggested
that most of these apparent outliers were simply instances of low ability and not random or inappropriate
responding. In the end, cases with more than 125 crash/


separation errors or more than 400 minutes of total
delay time were flagged as probable random (inappropriate) responders. A total of 16 cases were so flagged.
There were a number of instances of incomplete data.
The alpha pilot version of the software was not completely shock-proofed, and some examinees managed to
skip out of a trial without completing it. This rarely
happened on either the first or the last (fourth test) trial.
Where there was only one missing trial, averages were
computed across the remaining trials. Where more than
one trial was missing, the overall scores were set to
missing as well. In the end, we also flagged cases missing
either of the last two test trials. A total of 38 cases were so
flagged, leaving 386 cases with no flags for use in analyses.

Another measure of reliability was the correlation


between the overall scores generated during the regular
testing and the retest scores for those examinees who
completed an additional trial at the end of the day. Table
3.3.30 shows the correlation between alternative composite scores and the retest score. The alternative composites included means across trials, possibly leaving out
the first few trials, and a weighted composite giving
increasing weight to the later composites. (For AT, the
weights were 0 for the practice trials and 1, 2, 3, and 4
for the test trials. For the TW test, the weights were 1,
2, and 3 for the three regular trials.) The row labeled
SEPSK1-6, for example, corresponds to the simple
mean across all six (two practice and four test) trials.
Since the retest was a single trial and, in most cases, the
composite score from regular testing encompassed more
than one trial, the two measures being correlated do not
have equal reliability. In general, as expected, the correlations ranged between the values estimated for single
trial reliabilities and reliability estimates based on the
number of trials included in the composite scores. In
some cases, these test-retest reliabilities were lower
than the internal consistency estimates, indicating some
individual differences in the retention of skill over the
course of the testing day.
Based on analyses of the reliability data, it was
concluded that the most appropriate scores for use with
the pilot data were averages of the last two test trials. The
2-trial reliability for these scores was higher than the
three-trial reliability for the last 3 scores, the 4-trial
reliability for the last four scores, and so on. The
composite based on the last two trials also had the
highest correlation with the retest scores in most cases or
was at least a close second.

Reliability
After revised scale scores were computed for each
trial, reliability analyses were performed. In this case, an
ANOVA (generalizability) model was used to examine
the variance in scores across trials, examinee groups (test
orders), and examinees (nested within groups). The
analyses were conducted for varying numbers of trials,
from all six (two practice and four test) down to the last
two (test) trials. Table 3.3.29 shows variance component estimates for each of the sources of variation.
Notwithstanding modest efforts to standardize across
trials, there was still significant variation due to Trial
main effects in many cases. These were ignored in
computing reliabilities (using relative rather than absolute measures of reliability) since the trials would be
constant for all examinees and would not contribute to
individual variation in total scores. Similarly, Group
and Group by Trial effects were minimal and were not
included in the error term used for computing
reliabilities. Group effects are associated with different
positions in the overall battery. There will be no variation of test position in the final version of the battery.
Single trial reliabilities were computed as the ratio of
the valid variance due to subjects nested within groups,
SSN(Group), to the total variance, expressed as the sum
of SSN(Group) and SSN*Trial. For each variable, the
single trial reliability based on the last two trials was
identical to the correlation between the scores for those
two trials. Reliabilities for means across higher numbers of trials were computed by dividing the SSN*T
error component by the number of trials. This is
exactly the Spearman-Brown adjustment expressed
in generalizability terms.
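A hedged sketch of the computation described here, from hypothetical variance-component estimates: single-trial reliability is the subject component over subject plus error, and averaging over k trials divides the error component by k (the Spearman-Brown adjustment in generalizability terms).

def reliability_from_components(var_subjects, var_subject_by_trial, n_trials=1):
    """Relative reliability of a mean across n_trials trials."""
    return var_subjects / (var_subjects + var_subject_by_trial / n_trials)

v_subj, v_err = 40.0, 60.0                               # hypothetical components
print(round(reliability_from_components(v_subj, v_err, 1), 2))   # 0.40 single trial
print(round(reliability_from_components(v_subj, v_err, 2), 2))   # 0.57 mean of two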

Summary and Recommendations


It was felt that if separate scores were to be used in the
concurrent validation, additional practice and test trials
would be needed to achieve a high level of reliability for
the Separation Skill variable. It was recommended
that three practice trials be used with each trial targeted
to test understanding of specific rules and more tailored feedback after each trial. For example, the first
trial might include four planes, two headed for their
designated airport runways and two headed for their
designated exit gates. One of the two exiting planes
would be at the wrong level and the other at the wrong
speed. Similarly, one of the landing planes would be at
the wrong level and the other at the wrong speed. No
changes in direction would be required. At the end of a
very brief time, it could be determined whether the


examinee changed level and speed appropriately for
each aircraft, with feedback if they did not.
A second example might involve turning planes to get
to their destinations. Specific feedback on changing
directions would be given if the planes failed to reach
their assigned destination. Further testing of speed and
level rules would also be included. The final practice
scenario would involve airport landing directions and
flyovers.
Following the three practice scenarios (which might
take a total of 10 minutes to run with another 10
minutes for feedback), five test scenarios with increasing
difficulties were proposed. The alpha fourth test scenario may be a bit too difficult and might be toned down
a little. However, controller performance is expected to
be at a much higher level, so at least two relatively
difficult scenarios should be included. After three practice and three easier test scenarios, performance on the
last two more difficult scenarios should be quite reliable.

The three scores analyzed for the TW test were (a)


Pattern Recognition Accuracy (PRACCY), defined as
the percent of correct pattern matching responses out of
all correct and incorrect responses (i.e., excluding time-outs); (b) Pattern Recognition Speed (PRSPD), a transformation of the average time, in milliseconds, for
correct responses; and (c) Time Wall Accuracy
(TWACCY), a transformation of the mean absolute
time error, in milliseconds. The transformations used
for Pattern Recognition Speed and Time Wall Accuracy
were identical in form to those used with the AT test. In
this case, however, the transformations mapped the
maximum value to about 100 and the mean value to
about 50 across all trials, rather than using a separate
transformation for each trial. This was done because the
trials did not vary in difficulty for TW as they did for AT.
Case Elimination
Figure 3.3.1 shows a plot of the Pattern Recognition
Accuracy and Speed variables. A number of cases had
relatively high speed scores and lower than chance
(50%) accuracy scores. In subsequent analyses, all cases
with an accuracy score less than 40 on any of the
operational trials were deleted. This resulted in a deletion of 12 participants.

Software Changes
After the Alpha Version pilot test, the AT test was
changed to have more extensive and more highly edited
instructions and was converted to a 32-bit version to run
under Windows 95. The practice scenarios were modified to teach specific aspects of the exercises (changing
speed and level in practice 1, changing directions in
practice 2, noticing airport landing directions, and
coping with pilot readback errors in practice 3). Specific
feedback was provided after each practice session keyed to
aspects of the examinee's performance on the practice trial.
The new version of the scenario player provided
slightly different score information. In particular, the "en route delay" variable was computed as the total en route time for planes that landed correctly. We modified the shell program to read the replay file and copy information from the exit records (type XT) into the examinee's data file. This allowed us to record which planes either crashed or were still flying at the end of the scenario. We computed a "total en route time" to replace the delay time provided by the Alpha version.

Reliability
Tables 3.3.29 and 3.3.30 show internal consistency
and test-retest reliability estimates for TW as well as for
AT. Analyses of these data suggested that averaging
across all three trials led to the most reliable composite
for use in analyses of the pilot data.
Summary and Recommendations
Time Wall Accuracy reliability estimates were modest,
although the test-retest correlations held up fairly well.
Preliminary results suggested that five or six trials may be
needed to get highly reliable results on all three measures.
Software Changes
The trial administration program was changed to allow us to specify the number of Time Wall items administered and to shut off the warm-up trials for each administration. The main program then called the trial administration program 6 times. The first three trials had 5 Time Wall items each and were considered practice trials. The next three trials had 25 Time Wall items each and were considered test trials. After the practice trials, the examinee's performance was analyzed and specific feedback was given on how to improve their score.

Time Wall/Pattern Recognition Test


The analyses for the Time Wall (TW) test were very
similar to the analyses performed for the Air Traffic
Scenarios test. One difference was that TW had three
exactly parallel trials instead of two practice and four test
scenarios that differed in difficulty. Each TW trial had
a brief practice trial where no results were recorded.
Testing Times
Table 3.3.31 shows distributional statistics for instruction time and total time for the AT and TW tests
in their current form. While there was some variation in
instruction time, the total times were quite close to the
original targets (90 and 25 minutes, respectively).

Conclusions
The purpose of the pilot study was to determine if the predictor battery required revisions prior to its use in the proposed concurrent validation study. A thorough analysis of the various tests was performed. A number of recommendations - related to software presentation, item changes, and predictor construct revisions - were outcomes of the pilot study. The project team believed that the changes made to the test battery represented a substantial improvement over initial test development. The beta battery, used in the concurrent validation study, was a professionally developed set of tests that benefited greatly from the pilot study.

REFERENCES

Aerospace Sciences, Inc. (1991). Air traffic control specialist pre-training screen preliminary validation. Fairfax, VA: Aerospace Sciences, Inc.
Alexander, J., Alley, V., Ammerman, H., Fairhurst, W., Hostetler, C., Jones, G., & Rainey, C. (1989, April). FAA air traffic control operation concepts: Volume VII, ATCT tower controllers (DOT/FAA/AP-87/01, Vol. 7). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alexander, J., Alley, V., Ammerman, H., Hostetler, C., & Jones, G. (1988, July). FAA air traffic control operation concepts: Volume II, ACF/ACCC terminal and en route controllers (DOT/FAA/AP-87/01, Vol. 2, CHG 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alexander, J., Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1989, September). FAA air traffic control operation concepts: Volume VIII, TRACON controllers (DOT/FAA/AP-87/01, Vol. 8). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alley, V., Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1988, July). FAA air traffic control operation concepts: Volume V, ATCT/TCCC tower controllers (DOT/FAA/AP-87/01, Vol. 5, CHG 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Bergen, L., Davies, D., Hostetler, C., Inman, E., & Jones, G. (1987, November). FAA air traffic control operation concepts: Volume VI, ARTCC/HOST en route controllers (DOT/FAA/AP-87/01, Vol. 6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1989, May). FAA air traffic control task knowledge requirements: Volume I, ATCT tower controllers (DOT/FAA/ATC-TKR, Vol. 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Fligg, C., Pieser, W., Jones, G., Tischer, K., & Kloster, G. (1983, October). Enroute/terminal ATC operations concept (DOT/FAA/AP-83/16) (CDRL-AOO1 under FAA contract DTFA01-83-Y-10554). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Advanced Automation Program Office.
Bobko, P., Nickels, B. J., Blair, M. D., & Tartak, E. L. (1994). Preliminary internal report on the current status of the SACHA model and task interconnections: Volume I.
Boone, J. O. (1979). Toward the development of a new selection battery for air traffic control specialists (DOT/FAA/AM-79/21). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Boone, J., Van Buskirk, L., & Steen, J. (1980). The Federal Aviation Administration's radar training facility and employee selection and training (DOT/FAA/AM-80/15). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Borman, W. C. (1979). Format and training effects on rating accuracy and rater errors. Journal of Applied Psychology, 64, 410-421.
Borman, W. C., Hedge, J. W., & Hanson, M. A. (1992, June). Criterion development in the SACHA project: Toward accurate measurement of air traffic control specialist performance (Institute Report #222). Minneapolis, MN: Personnel Decisions Research Institutes.
Broach, D. (1996, November). User's guide for v4.0 of the Air Traffic Scenarios Test for Windows (WinATST). Oklahoma City, OK: Federal Aviation Administration Civil Aeromedical Institute, Human Resources Research Division.
Broach, D., & Brecht-Clark, J. (1994). Validation of the Federal Aviation Administration air traffic control specialist pre-training screen (DOT/FAA/AM-94/4). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Brokaw, L. D. (1957, July). Selection measures for air traffic control training (Technical Memorandum PL-TM-57-14). Lackland Air Force Base, TX: Personnel Laboratory, Air Force Personnel and Training Research Center.
Brokaw, L. D. (1959). School and job validation of selection measures for air traffic control training (WADC-TN-59-39). Lackland Air Force Base, TX: Wright Air Development Center, United States Air Force.
Brokaw, L. D. (1984). Early research on controller selection: 1941-1963. In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Buckley, E. P., & Beebe, T. (1972). The development of a motion picture measurement instrument for aptitude for air traffic control (DOT/FAA/RD-71/106). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Systems Research and Development Service.
Buckley, E. P., DeBaryshe, B. D., Hitchner, N., & Kohn, P. (1983). Methods and measurements in real-time air traffic control system simulation (DOT/FAA/CT-83/26). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Buckley, E. P., House, K., & Rood, R. (1978). Development of a performance criterion for air traffic control personnel research through air traffic control simulation (DOT/FAA/RD-78/71). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Systems Research and Development Service.
Buckley, E. P., O'Connor, W. F., & Beebe, T. (1969). A comparative analysis of individual and system performance indices for the air traffic control system (Final report) (DOT/FAA/NA-69/40; DOT/FAA/RD-69/50; Government accession #710795). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, National Aviation Facilities Experimental Center, Systems Research and Development Service.
Buckley, E. P., O'Connor, W. F., & Beebe, T. (1970). A comparative analysis of individual and system performance indices for the air traffic control system (DOT/FAA/NA-69/40). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, National Aviation Facilities Experimental Center.
Buckley, E. P., O'Connor, W. F., Beebe, T., Adams, W., & MacDonald, G. (1969). A comparative analysis of individual and system performance indices for the air traffic control system (DOT/FAA/NA-69/40). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Carter, D. S. (1979). Comparison of different shrinkage formulas in estimating population multiple correlation coefficients. Educational and Psychological Measurement, 39, 261-266.
Cattell, R. B., & Eber, H. W. (1962). The sixteen personality factor questionnaire. Champaign, IL: Institute for Personality and Ability Testing.
Cobb, B. B. (1967). The relationships between chronological age, length of experience, and job performance ratings of air route traffic control specialists (DOT/FAA/AM-67/1). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Cobb, B. B., & Mathews, J. J. (1972). A proposed new test for aptitude screening of air traffic controller applicants (DOT/FAA/AM-72/18). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Collins, W. E., Manning, C. A., & Taylor, D. K. (1984). A comparison of prestrike and poststrike ATCS trainees: Biographic factors associated with Academy training success. In A. VanDeventer, W. Collins, C. Manning, D. Taylor, & N. Baxter (Eds.), Studies of poststrike air traffic control specialist trainees: I. Age, biographical factors, and selection test performance (DOT/FAA/AM-84/18). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Collins, W. E., Nye, L. G., & Manning, C. A. (1990). Studies of poststrike air traffic control specialist trainees: III. Changes in demographic characteristics of Academy entrants and bio-demographic predictors of success in air traffic control selection and Academy screening (DOT/FAA/AM-90/4). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Convey, J. J. (1984). Personality assessment of ATC applicants. In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Cooper, M., Blair, M. D., & Schemmer, F. M. (1994). Separation and Control Hiring Assessment (SACHA) draft preliminary approach predictors, Vol. 1: Technical report. Bethesda, MD: University Research Corporation.
Costa, P. T., Jr., & McCrae, R. R. (1988). Personality in adulthood: A six-year longitudinal study of self-reports and spouse ratings on the NEO Personality Inventory. Journal of Personality and Social Psychology, 54, 853-863.
Ekstrom, R. B., French, J. W., Harman, H. H., & Dermen, D. (1976). Manual for Kit of Factor-Referenced Cognitive Tests. Princeton, NJ: Educational Testing Service.
Fleishman, E. A., & Quaintance, M. K. (1984). Taxonomies of human performance. Orlando, FL: Academic Press.
Gibb, G. D., Smith, M. L., Swindells, N., Tyson, D., Gieraltowski, M. J., Petschauser, K. J., & Haney, D. U. (1991). The development of an experimental selection test battery for air traffic control specialists. Daytona Beach, FL.
Hanson, M. A., Hedge, J. W., Borman, W. C., & Nelson, L. C. (1993). Plans for developing a set of simulation job performance measures for air traffic control specialists in the Federal Aviation Administration (Institute Report #236). Minneapolis, MN: Personnel Decisions Research Institutes.
Hedge, J. W., Borman, W. C., Hanson, M. A., Carter, G. W., & Nelson, L. C. (1993). Progress toward development of ATCS performance criterion measures (Institute Report #235). Minneapolis, MN: Personnel Decisions Research Institutes.
Hogan, R. (1996). Personality assessment. In R. S. Barrett (Ed.), Fair employment in human resource management (pp. 144-152). Westport, CT: Quorum Books.
Houston, J. S., & Schneider, R. J. (1997). Analysis of Experience Questionnaire (EQ) beta test data. Unpublished manuscript.
Human Technology, Inc. (1991). Cognitive task analysis of en route air traffic controller: Model extension and validation (Report No. OPM-87-9041). McLean, VA: Author.
Human Technology, Inc. (1993). Summary job analysis. Report to the Federal Aviation Administration Office of Personnel, Staffing Policy Division (Contract #OPM-91-2958). McLean, VA: Author.
Landon, T. E. (1991). Job performance for the en-route ATCS: A review with applications for ATCS selection. Paper submitted to Minnesota Air Traffic Controller Training Center.
Manning, C. A. (1991). Individual differences in air traffic control specialist training performance. Journal of Washington Academy of Sciences, 11, 101-109.
Manning, C. A. (1991). Procedures for selection of air traffic control specialists. In H. Wing & C. Manning (Eds.), Selection of air traffic controllers: Complexity, requirements and public interest (DOT/FAA/AM-91/9). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Manning, C. A., Della Rocco, P. S., & Bryant, K. D. (1989). Prediction of success in air traffic control field training as a function of selection and screening test performance (DOT/FAA/AM-89/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Mecham, R. C., & McCormick, E. J. (1969). The rated attribute requirements of job elements in the position analysis questionnaire (Office of Naval Research Contract Nonr-1100 (28), Report No. 1). Lafayette, IN: Occupational Research Center, Purdue University.
Mies, J., Coleman, J. G., & Domenech, O. (1977). Predicting success of applicants for positions as air traffic control specialists in the Air Traffic Service (Contract No. DOT FA-75WA-3646). Washington, DC: Education and Public Affairs, Inc.
Milne, A. M., & Colmen, J. (1972). Selection of air traffic controllers for FAA (Contract No. DOT-FA70WA-2371). Washington, DC: Education and Public Affairs, Inc.
Myers, J., & Manning, C. (1988). A task analysis of the Automated Flight Service Station Specialist job and its application to the development of the Screen and Training program (Unpublished manuscript). Oklahoma City, OK: Civil Aeromedical Institute, Human Resources Research Division.
Nickels, B. J., Bobko, P., Blair, M. D., Sands, W. A., & Tartak, E. L. (1995). Separation and control hiring assessment (SACHA) final job analysis report (Deliverable Item 007A under FAA contract DFTA01-91-C-00032). Washington, DC: Federal Aviation Administration, Office of Personnel.
Potosky, D., & Bobko, P. (1997). Assessing computer experience: The Computer Understanding and Experience (CUE) scale. Poster presented at the Society for Industrial and Organizational Psychology (SIOP) conference, April 12, St. Louis, MO.
Pulakos, E. D. (1984). A comparison of rater training programs: Error training and accuracy training. Journal of Applied Psychology, 69, 581-588.
Pulakos, E. D. (1986). The development of a training program to increase accuracy with different rating formats. Organizational Behavior and Human Decision Processes, 38, 76-91.
Pulakos, E. D., & Borman, W. C. (1986). Rater orientation and training. In E. D. Pulakos & W. C. Borman (Eds.), Development and field test report for the Army-wide rating scales and the rater orientation and training program (Technical Report #716). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
Pulakos, E. D., Keichel, K. L., Plamondon, K., Hanson, M. A., Hedge, J. W., & Borman, W. C. (1996). SACHA task 3 final report (Institute Report #286). Minneapolis, MN: Personnel Decisions Research Institutes.
Rock, D. B., Dailey, J. T., Ozur, H., Boone, J. O., & Pickrel, E. W. (1978). Study of the ATC job applicants 1976-1977 (Technical Memorandum PL-TM-57-14). In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (pp. 397-410) (DOT/FAA/AM-84/2). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Schemmer, F. M., Cooper, M. A., Blair, M. D., Barton, M. A., Kieckhaefer, W. F., Porter, D. L., Abrahams, N., Huston, J., Paullin, C., & Bobko, P. (1996). Separation and Control Hiring Assessment (SACHA) interim approach predictors, Volume 1: Technical report. Bethesda, MD: University Research Corporation.
Schroeder, D. J., & Dollar, C. S. (1997). Personality characteristics of pre/post-strike air traffic control applicants (DOT/FAA/AM-97/17). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Schroeder, D. J., Dollar, C. S., & Nye, L. G. (1990). Correlates of two experimental tests with performance in the FAA Academy Air Traffic Control Nonradar Screen Program (DOT/FAA/AM-90/8). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
Sollenberger, R. L., Stein, E. S., & Gromelski, S. (1997). The development and evaluation of a behaviorally based rating form for assessing air traffic controller performance (DOT/FAA/CT-TN96-16). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Stein, E. S. (1992). Simulation variables. Unpublished manuscript.
Taylor, D. K., VanDeventer, A. D., Collins, W. E., & Boone, J. O. (1983). Some biographical factors associated with success of air traffic control specialist trainees at the FAA Academy during 1980. In A. VanDeventer, D. Taylor, W. Collins, & J. Boone (Eds.), Three studies of biographical factors associated with success in air traffic control specialist screening/training at the FAA Academy (DOT/FAA/AM-83/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Taylor, M. V., Jr. (1952). The development and validation of a series of aptitude tests for the selection of personnel for positions in the field of Air Traffic Control. Pittsburgh, PA: American Institutes for Research.
Trites, D. K. (1961). Problems in air traffic management: I. Longitudinal prediction of effectiveness of air traffic controllers (DOT/FAA/AM-61/1). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Trites, D. K., & Cobb, B. B. (1963). Problems in air traffic management: IV. Comparison of pre-employment job-related experience with aptitude test predictors of training and job performance of air traffic control specialists (DOT/FAA/AM-63/31). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Tucker, J. A. (1984). Development of dynamic paper-and-pencil simulations for measurement of air traffic controller proficiency (pp. 215-241). In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
VanDeventer, A. D. (1983). Biographical profiles of successful and unsuccessful air traffic control specialist trainees. In A. VanDeventer, D. Taylor, W. Collins, & J. Boone (Eds.), Three studies of biographical factors associated with success in air traffic control specialist screening/training at the FAA Academy (DOT/FAA/AM-83/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Weltin, M., Broach, D., Goldbach, K., & O'Donnell, R. (1992). Concurrent criterion related validation of air traffic control specialist pre-training screen. Fairfax, VA: Author.
Wherry, R. J. (1940). Appendix A. In W. H. Stead & C. P. Shartle (Eds.), Occupational counseling techniques.
Yee, P. L., Hunt, E., & Pellegrino, J. W. (1991). Coordinating cognitive information: Task effects and individual differences in integrating information from several sources. Cognitive Psychology, 23, 615-680.


DOT/FAA/AM-01/6

Office of Aviation Medicine


Washington, D.C. 20591

Documentation of Validity
for the AT-SAT
Computerized Test Battery
Volume II
R.A. Ramos
Human Resources Research Organization
Alexandria, VA 22314-1591
Michael C. Heil
Carol A. Manning
Civil Aeromedical Institute
Federal Aviation Administration
Oklahoma City, OK 73125

March 2001

Final Report

This document is available to the public


through the National Technical Information
Service, Springfield, Virginia 22161.

U.S. Department
of Transportation
Federal Aviation
Administration


N O T I C E
This document is disseminated under the sponsorship of
the U.S. Department of Transportation in the interest of
information exchange. The United States Government
assumes no liability for the contents thereof.


Technical Report Documentation Page

1. Report No.: DOT/FAA/AM-01/6
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: Documentation of Validity for the AT-SAT Computerized Test Battery, Volume II
5. Report Date: March 2001
6. Performing Organization Code:
7. Author(s): Ramos, R.A. (1), Heil, M.C. (2), and Manning, C.A. (2)
8. Performing Organization Report No.:
9. Performing Organization Name and Address:
   (1) Human Resources Research Organization, 68 Canal Center Plaza, Suite 400, Alexandria, VA 22314-1591
   (2) FAA Civil Aeromedical Institute, P.O. Box 25082, Oklahoma City, OK 73125
10. Work Unit No. (TRAIS):
11. Contract or Grant No.:
12. Sponsoring Agency Name and Address: Office of Aviation Medicine, Federal Aviation Administration, 800 Independence Ave., S.W., Washington, D.C. 20591
13. Type of Report and Period Covered:
14. Sponsoring Agency Code:
15. Supplemental Notes: Work was accomplished under approved subtask AM-B-99-HRR-517.
16. Abstract: This document is a comprehensive report on a large-scale research project to develop and validate a computerized selection battery to hire Air Traffic Control Specialists (ATCSs) for the Federal Aviation Administration (FAA). The purpose of this report is to document the validity of the Air Traffic Selection and Training (AT-SAT) battery according to legal and professional guidelines. An overview of the project is provided, followed by a history of the various job analyses efforts. Development of predictors and criterion measures are given in detail. The document concludes with the presentation of the validation of predictors and analyses of archival data.
17. Key Words: Air Traffic Controllers, Selection, Assessment, Job Analyses
18. Distribution Statement: Document is available to the public through the National Technical Information Service, Springfield, Virginia 22161
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 179
22. Price:

Form DOT F 1700.7 (8-72)    Reproduction of completed page authorized


TABLE OF CONTENTS
VOLUME II
Page
CHAPTER 4 - DEVELOPMENT OF CRITERION MEASURES OF AIR TRAFFIC CONTROLLER PERFORMANCE ................... 1
CBPM ..................................................................................................................................................... 1
CHAPTER 5.1 - FIELD PROCEDURES FOR CONCURRENT VALIDATION STUDY ....................................................... 13
CHAPTER 5.2 - DEVELOPMENT OF PSEUDO-APPLICANT SAMPLE ......................................................................... 17
CHAPTER 5.3 - DEVELOPMENT OF DATA BASE ................................................................................................. 21
CHAPTER 5.4 - BIOGRAPHICAL AND COMPUTER EXPERIENCE INFORMATION: DEMOGRAPHICS FOR THE VALIDATION
STUDY .................................................................................................................................................. 31
Total Sample ....................................................................................................................................... 31
Controller Sample ............................................................................................................................... 31
Pseudo-Applicant Sample .................................................................................................................... 32
Computer Use and Experience Questionnaire ..................................................................................... 32
Performance Differences ...................................................................................................................... 33
Relationship Between Cue-Plus and Predictor Scores .......................................................................... 33
Summary ............................................................................................................................................. 35
CHAPTER 5.5 - PREDICTOR-CRITERION ANALYSES ............................................................................................. 37
CHAPTER 5.6 - ANALYSES OF GROUP DIFFERENCES AND FAIRNESS ..................................................................... 43
CHAPTER 6 - THE RELATIONSHIP OF FAA ARCHIVAL DATA TO AT-SAT PREDICTOR AND CRITERION MEASURES .. 49
Previous ATC Selection Tests .............................................................................................................. 49
Other Archival Data Obtained for ATC Candidates ........................................................................... 51
Archival Criterion Measures ................................................................................................................ 52
Historical Studies of Validity of Archival Measures ............................................................................. 52
Relationships Between Archival Data and AT-SAT Measures .............................................................. 54
REFERENCES ................................................................................................................................................ 61
List of Figures and Tables
Figures
Figure 4.1. Map of CBPM Airspace ... 67
Figure 4.2. Airspace Summary: Sector 05 in Hub Center ... 68
Figure 4.3. Example CBPM Item ... 69
Figure 4.4. Aero Center Airspace ... 70
Figure 5.2.1. Sample Classified Newspaper Advertisement for Soliciting Civilian Pseudo-Applicants ... 70
Figure 5.2.2. Sample flyer advertisement for soliciting civilian pseudo-applicants ... 71
Figure 5.3.1. AT-SAT Data Base (*) ... 72
Figure 5.3.2. CD-ROM Directory Structure of AT-SAT Data Base ... 73
Figure 5.5.1. Expected Performance: OPM vs. AT-SAT ... 74
Figure 5.5.2. Percentage of Selected Applicants whose Expected Performance is in the Top Third of Current Controllers: OPM vs. AT-SAT ... 75
Figure 5.6.1. Fairness Regression for Blacks Using AT-SAT Battery Score and Composite Criterion ... 75
Figure 5.6.2. Fairness Regression for Hispanics Using AT-SAT Battery Score and Composite Criterion ... 76
Figure 5.6.3. Fairness Regression for Females Using AT-SAT Battery Score and Composite Criterion ... 77
Figure 5.6.4. Confidence Intervals for the Slopes in the Fairness Regressions ... 78
Figure 5.6.5. Expected Score Frequency by Applicant Group ... 79
Figure 5.6.6. Percent Passing by Recruitment Strategy ... 80

Tables
Table 4.1. CBPM Development and Scaling Participants: Biographical Information ... 81
Table 4.2. CBPM Scaling Workshops: Interrater Reliability Results ... 82
Table 4.3. Performance Categories for Behavior Summary Scales ... 83
Table 4.4. Pilot Test Results: Computer-Based Performance Measure (CBPM) Distribution of Scores ... 84
Table 4.5. Pilot Test Results: Means and Standard Deviations for Ratings on Each Dimension ... 85
Table 4.6. Pilot Test Results: Interrater Reliabilities for Ratings ... 85
Table 4.7. HFPM Pilot Test Results - Correlations Between Ratings for Rater Pairs (Collapsed Across Ratee) Both Across All Scenarios and Within Each Scenario ... 87
Table 4.8. Rater-Ratee Assignments ... 88
Table 4.9. Computer-Based Performance Measure (CBPM): Distribution of Scores in Validation Sample ... 88
Table 4.10. Number and Percentage of Supervisor Ratings at Each Scale Point in the Validation Sample ... 88
Table 4.11. Number and Percentage of Peer Ratings at Each Scale Point in the Validation Sample ... 89
Table 4.12. Interrater Reliabilities for Peer, Supervisor and Combined Ratings ... 89
Table 4.13. Means and Standard Deviations for Mean Ratings on Each Dimension ... 90
Table 4.14. Correlations Between Rating Dimensions for Peers and Supervisors ... 91
Table 4.15. Factor Analysis Results for Performance Ratings ... 92
Table 4.16. Descriptive Statistics of High Fidelity Performance Measure Criterion Variables ... 92
Table 4.17. Interrater Reliabilities for OTS Ratings (N=24) ... 93
Table 4.18. Principal Components Analysis of the High Fidelity Criterion Space ... 93
Table 4.19. Intercorrelations Between Proposed Criterion Scores ... 95
Table 4.20. Job Analysis-Item Linkage Task Results for CBPM and HFPM ... 96
Table 5.2.1. 1990-1992 Profile Analysis of Actual FAA ATCS Applicants ... 97
Table 5.2.2. Bureau of Census Data for Race/Ethnicity ... 98
Table 5.2.3. Background Characteristics by Testing Samples ... 98
Table 5.4.1. Ethnicity and Gender of all Participants ... 99
Table 5.4.2. Educational Background of All Participants ... 99
Table 5.4.3. Data Collection Locations for All Participants ... 99
Table 5.4.4. Ethnicity and Gender of Air Traffic Controllers ... 101
Table 5.4.5. Air Traffic Controller Sample Educational Background ... 102
Table 5.4.6. Air Traffic Controller Sample from Participating Locations ... 102
Table 5.4.7. Air Traffic Controller Sample Time in Current Position ... 102
Table 5.4.8. Air Traffic Controller Sample Job Experience at any Facility ... 103
Table 5.4.9. Ethnicity and Gender of Pseudo-Applicant Sample ... 103
Table 5.4.10. CUE-Plus Scale Item Means and Frequencies ... 105
Table 5.4.11. CUE-Plus Means and Standard Deviations by Sample ... 106
Table 5.4.12. Inter-Correlations of CUE-Plus Items ... 107
Table 5.4.13. Item-Total Statistics for CUE-Plus: All Respondents ... 109
Table 5.4.14. Varimax and Oblique Rotated Factor Patterns (CUE-Plus) ... 110
Table 5.4.15. Eigenvalues and Variance (CUE-Plus) ... 110
Table 5.4.16. CUE-Plus Means, S.D. and d-Score for Gender ... 111
Table 5.4.17. Means, S.D. and d-Score for Ethnicity ... 111
Table 5.4.18. Correlations between CUE-Plus and Predictor Battery: Controllers ... 112
Table 5.4.19. Correlations between CUE-Plus and Predictor Battery: Controllers ... 113
Table 5.4.20. Correlations between CUE-Plus and Predictor Battery: Pseudo Applicants ... 114
Table 5.4.21. Correlations between CUE-Plus and Predictor Battery: Pseudo Applicants ... 115


Table 5.4.22. Determinants of Applied Math Test: No. of Items Correct ... 116
Table 5.4.23. Determinants of Angles Test: No. of Items Correct ... 116
Table 5.4.24. Determinants of Air Traffic Scenarios: Efficiency ... 116
Table 5.4.25. Determinants of Air Traffic Scenarios: Safety ... 117
Table 5.4.26. Determinants of Air Traffic Scenarios: Procedural Accuracy ... 117
Table 5.4.27. Determinants of Analogy: Information Processing ... 117
Table 5.4.28. Determinants of Analogy Test: Reasoning ... 118
Table 5.4.29. Determinants of Dials Test: No. of Items Correct ... 118
Table 5.4.30. Determinants of Letter Factory Test: Situational Awareness ... 118
Table 5.4.31. Determinants of Letter Factory Test: Planning & Thinking Ahead ... 119
Table 5.4.32. Determinants of Scan Test: Total Score ... 119
Table 5.5.1. Simple Validities: Correlations Between Predictor Scores and Criteria ... 120
Table 5.5.2. Incremental Validities: Increases in Validities when Adding a Scale or Test ... 122
Table 5.5.3. Comparison of Five Predictor Weighting Methods ... 123
Table 5.5.4. Validity Coefficients for the Predictor Composite ... 124
Table 5.5.5. Effect of Cut Score on Predicted Controller Performance ... 125
Table 5.5.6. Expected Performance by Validity and Selectivity ... 126
Table 5.6.1. Means for All Scales by Sample, Gender, and Race ... 127
Table 5.6.2. Standard Deviations for All Scales by Sample, Gender, and Race ... 128
Table 5.6.3. Sample Sizes for All Scales by Sample, Gender, and Race ... 129
Table 5.6.4. Frequency Table for Chi-Square Test of Association for Predictor Composite ... 131
Table 5.6.5. Group Differences in Means and Passing Rates for the Pseudo-Applicants ... 131
Table 5.6.6. Fairness Analysis Results ... 133
Table 5.6.7. Criterion d-Scores Analyses for Controllers ... 135
Table 5.6.8. Power Analysis of Fairness Regressions ... 136
Table 5.6.9. Potential Impact of Targeted Recruitment ... 136
Table 6.1. Correlations Between Archival and AT-SAT Criterion Measures (N=669) ... 137
Table 6.2. Correlations of Archival Selection Procedures with Archival and AT-SAT Criterion Measures ... 138
Table 6.3. Correlations of Archival Selection Procedure Components with Archival and AT-SAT Criterion Measures (N=212) ... 139
Table 6.4. Correlations of Criterion Measures from High Fidelity Simulation with Archival Performance-Based Predictors and Criterion Measures ... 141
Table 6.5. Correlations Between OPM Selection Tests and AT-SAT Predictor Tests (N=561) ... 142
Table 6.6. Correlations of AT-SAT Applied Math, Angles, and Dials Tests with Archival Dial Reading, Directional Headings, Math Aptitude Tests, & H.S. Math Grades Biographical Item ... 143
Table 6.7. Correlation of the Version of Air Traffic Scenarios Test Used in Pre-Training Screen Validation with the Version of Air Traffic Scenarios Test Used in AT-SAT Validation ... 143
Table 6.8. Oblique Principal Components Analysis of EQ Scales ... 144
Table 6.9. Description of 16PF Scales ... 145
Table 6.10. Correlation of EQ and 16PF Scales ... 147
Table 6.11. Results of Multiple Linear Regression of OPM Rating, Final Score in Nonradar Screen Program, and AT-SAT Predictor Tests on AT-SAT Composite Criterion Measure (N=586) ... 148

Appendices:
Appendix C - Criterion Assessment Scales ....................................................................................................... C1
Appendix D - Rater Training Script .................................................................................................................D1
Appendix E - AT-SAT High Fidelity Simulation Over the Shoulder (OTS) Rating Form ................................ E1
Appendix F - Behavioral and Event Checklist ...................................................................................................F1
Appendix G - AT-SAT High Fidelity Standardization Guide ........................................................................... G1
Appendix H - Pilot Test Rater Comparisons ................................................................................................... H1
Appendix I - Sample Cover Letter and Table to Assess the Completeness of Data Transmissions ...................... I1


CHAPTER 4 - DEVELOPMENT OF CRITERION MEASURES OF AIR TRAFFIC CONTROLLER PERFORMANCE

Walter C. Borman, Jerry W. Hedge, Mary Ann Hanson, Kenneth T. Bruskiewicz
Personnel Decisions Research Institutes, Inc.

Henry Mogilka and Carol Manning
Federal Aviation Administration

Laura B. Bunch and Kristen E. Horgen
University of South Florida and Personnel Decisions Research Institutes, Inc.

INTRODUCTION

An important element of the AT-SAT predictor development and validation project is criterion performance measurement. To obtain an accurate picture of the experimental predictor tests' validity for predicting controller performance, it is important to have reliable and valid measures of controller job performance. That is, a concurrent validation study involves correlating predictor scores for controllers in the validation sample with criterion performance scores. If these performance scores are not reliable and valid, our inferences about predictor test validities are likely to be incorrect.

The job of air traffic controller is very complex and potentially difficult to capture in a criterion development effort. Yet, the goal here was to develop criterion measures that would provide a comprehensive picture of controller job performance.

Initial job analysis work suggested a model of performance that included both maximum and typical performance (Bobko, Nickels, Blair & Tartak, 1994; Nickels, Bobko, Blair, Sands, & Tartak, 1995). More so than with many jobs, maximum can-do performance is very important in controlling air traffic. There are times on this job when the most important consideration is maximum performance - does the controller have the technical skill to keep aircraft separated under very difficult conditions? Nonetheless, typical performance over time is also important for this job.

Based on a task-based job analysis (Nickels et al., 1995), a critical incidents study (Hedge, Borman, Hanson, Carter & Nelson, 1993), and past research on controller performance (e.g., Buckley, O'Connor, & Beebe, 1969; Cobb, 1967), we began to formulate ideas for the criterion measures. Hedge et al. (1993) discuss literature that was reviewed in formulating this plan, and summarize an earlier version of the criterion plan. Basically, this plan was to develop multiple measures of controller performance. Each of these measures has strengths for measuring performance, as well as certain limitations. However, taken together, we believe the measures will provide a valid depiction of each controller's job performance. The plan involved developing a special situational judgment test (called the Computer-Based Performance Measure, or CBPM) to represent the maximum performance/technical proficiency part of the job and behavior-based rating scales to reflect typical performance. A high-fidelity air traffic control test (the High Fidelity Performance Measure, HFPM) was also to be developed to investigate the construct validity of the lower fidelity CBPM with a subset of the controllers who were administered the HFPM.

The Computer-Based Performance Measure (CBPM)
The goal in developing the CBPM was to provide a
relatively practical, economical measure of technical
proficiency that could be administered to the entire
concurrent validation sample. Practical constraints limited the administration of the higher fidelity measure
(HFPM) to a subset of the validation sample.
Previous research conducted by Buckley and Beebe (1972) suggested that scores on a lower fidelity simulation are likely to correlate with scores on a real-time, hands-on simulation and also with performance ratings provided by peers and supervisors. Their motion picture or CODE test presented controllers with a motion picture of a radar screen and asked them to note when there were potential conflictions. Buckley and Beebe reported significant correlations between CODE scores and for-research-only ratings provided by the controllers' peers, but the sample size in this research was only 19. Buckley, O'Connor, and Beebe (1969) also reported that correlations between CODE scores and scores on a higher-fidelity simulation were substantial (the highest correlation was .73), but, again, the sample size was very small. Finally, Milne and Colmen (1972) found a substantial correlation between the CODE test and for-research-only job performance ratings. In general, results for the CODE test suggest that a lower-fidelity simulation can capture important air traffic controller judgment and decision-making skills.
Again, the intention in the present effort was to
develop a computerized performance test that as closely
as possible assessed the critical technical proficiency,
separating-aircraft part of the controller job. Thus, the
target performance constructs included judgment and
decision making in handling air traffic scenarios, procedural knowledge about how to do technical tasks, and
confliction prediction; i.e., the ability to know when
a confliction is likely to occur sometime in the near
future if nothing is done to address the situation.
The CBPM was patterned after the situational judgment test method. The basic idea was to have an air
traffic scenario appear on the computer screen, allow a
little time for the problem to evolve, and then freeze the
screen and ask the examinee a multiple choice question
about how to respond to the problem. To develop this
test, we trained three experienced controllers on the
situational judgment test method and elicited initial
ideas about applying the method to the air traffic
context.
The first issue in developing this test was the airspace
in which the test would be staged. There is a great deal
of controller job knowledge that is unique to controlling
traffic in a specific airspace (e.g., the map, local obstructions). Each controller is trained and certified on the
sectors of airspace where he or she works. Our goal in
designing the CBPM airspace was to include a set of
airspace features (e.g., flight paths, airports, special use
airspace) sufficiently complicated to allow for development of difficult, realistic situations or problems, but to
also keep the airspace relatively simple because it is
important that controllers who take the CBPM can

learn these features very quickly. Figure 4.1 shows the map of the CBPM airspace, and Figure 4.2 is a summary
of important features of this airspace that do not appear
on the map.
After the airspace was designed, the three air traffic
controller subject matter experts (SMEs) were provided
with detailed instructions concerning the types of scenarios and questions appropriate for this type of test.
These SMEs then developed several air traffic scenarios
on paper and multiple choice items for each scenario.
The plan was to generate many more items than were
needed on the final test, and then select a subset of the
best items later in the test development process. Also,
based on the job analysis (Nickels et al., 1995) a list of
the 40 most critical en route controller tasks was available, and one primary goal in item development was to
measure performance in as many of these tasks as
possible, especially those that were rated most critical.
At this stage, each scenario included a map depicting
the position of each aircraft at the beginning of the
scenario, flight strips that provided detailed information about each aircraft (e.g., the intended route of
flight), a status information area (describing weather
and other pertinent background information), and a
script describing how the scenario would unfold. This
script included the timing and content of voice communications from pilots and/or controllers, radar screen
updates (which occur every 10 seconds in the en route
environment), other events (e.g., hand-offs, the appearance of unidentified radar targets, emergencies), and the
exact timing and wording of each multiple choice
question (along with possible responses).
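
As a rough illustration (the field names and structure below are invented, not the project's actual file format), each paper scenario at this stage can be thought of as a small structured record whose parts mirror the description above:

    # Hypothetical representation of one CBPM scenario as drafted by the SMEs:
    # aircraft positions, flight strips, a status information area, and a timed
    # script of events, including the embedded multiple-choice questions.
    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class ScriptEvent:
        time_sec: int          # when the event occurs in the scenario
        kind: str              # e.g., "pilot_transmission", "radar_update",
                               # "hand_off", "emergency", or "question"
        detail: Dict           # wording of a transmission, or the question stem,
                               # response options, and exact timing

    @dataclass
    class Scenario:
        start_positions: List[Dict]   # aircraft positions at the start of the scenario
        flight_strips: List[Dict]     # intended route of flight, etc., per aircraft
        status_info: str              # weather and other background information
        script: List[ScriptEvent] = field(default_factory=list)
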
After the controllers had independently generated a
large number of scenarios and items, we conducted
discussion sessions in which each SME presented his
scenarios and items, and then the SMEs and researchers
discussed and evaluated these items. Discussion included topics such as whether all necessary information
was included, whether the distractors were plausible,
whether or not there were correct or at least better
responses, whether the item was too tricky (i.e., choosing the most effective response did not reflect an important skill), or too easy (i.e., the correct response was
obvious), and whether the item was fair for all facilities
(e.g., might the item be answered differently at different
facilities because of different policies or procedures?). As
mentioned previously, the CBPM was patterned after
the situational judgment test approach. Unlike other
multiple choice tests, there was not necessarily only one
correct answer, with all the others being wrong. Some

items had, for example, one best answer and one or two
others that represented fairly effective responses. These
test development sessions resulted in 30 scenarios and
99 items, with between 2 and 6 items per scenario.
An initial version of the test was then programmed to
run on a standard personal computer with a 17-inch
high-resolution monitor. This large monitor was needed
to realistically depict the display as it would appear on
an en route radar screen. The scenarios were initially
programmed using a radar engine, which had previously been developed for the FAA for training purposes.
This program was designed to realistically display airspace features and the movement of aircraft. After the
scenarios were programmed into the radar engine, the
SMEs watched the scenarios evolve and made modifications as necessary to meet the measurement goals. Once
realistic positioning and movement of the aircraft had
been achieved, the test itself was programmed using
Authorware. This program presented the radar screens,
voice communications, and multiple choice questions,
and also collected the multiple choice responses.
Thus, the CBPM is essentially self-administering
and runs off a CD-ROM. The flight strips and status
information areas are compiled into a booklet, with one
page per scenario, and the airspace summary and sector
map (see Figures 4.1 and 4.2) are displayed near the
computer when the test is administered. During test
administration, controllers are given 60 seconds to
review each scenario before it begins. During this time,
the frozen radar display appears on the screen, and
examinees are allowed to review the flight strips and any
other information they believe is relevant to that particular scenario (e.g., the map or airspace summary).
Once a test item has been presented, examinees are given 25 seconds to answer the question. This is analogous to the controller job, where controllers are expected to get the picture of what is going on in their sector of airspace, and then are sometimes required to react
quickly to evolving situations. We also prepared a
training module to familiarize examinees with the airspace and instructions concerning how to take the test.
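
A compact sketch of the per-scenario pacing described above is given below; the function names are placeholders rather than the actual Authorware implementation.

    # Per-scenario pacing of the CBPM (placeholder functions, hypothetical API).
    REVIEW_SECONDS = 60   # frozen radar display; examinee may study strips and map
    ANSWER_SECONDS = 25   # time allowed to answer each multiple-choice question

    def administer_scenario(scenario, show_frozen_display, play_until_next_item, ask):
        show_frozen_display(scenario, seconds=REVIEW_SECONDS)
        responses = []
        for item in play_until_next_item(scenario):   # yields each item as the screen freezes
            responses.append(ask(item, timeout=ANSWER_SECONDS))
        return responses
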
After preparing these materials, we gathered a panel
of four experienced controllers who were teaching at the
FAA Academy and another panel of five experienced
controllers from the field to review the scenarios and
items. Specifically, each of these groups was briefed
regarding the project, trained on the airspace, and then
shown each of the scenarios and items. Their task was to
rate the effectiveness level of each response option.
Ratings were made independently on a 1-7 scale. Table 4.1 describes the controllers who participated in this initial scaling workshop, and Table 4.2 summarizes the intraclass correlations (interrater agreement across items) for the two groups. After this initial rating session with
each of the groups, the panel members compared their
independent ratings and discussed discrepancies. In
general, two different outcomes occurred as a result of
these discussions. In some cases, one or two SMEs failed
to notice or misinterpreted part of the item (e.g., did not
examine an important flight strip). For these cases, no
changes were generally made to the item. In other cases,
there was a legitimate disagreement about the effectiveness of one or more response options. Here, we typically
discussed revisions to the item or the scenario itself that
would lead to agreement between panel members (without making the item overly transparent). In addition,
discussions with the first group indicated that several
items were too easy (i.e., the answer was obvious). These
items were revised to be less obvious. Five items were
dropped because they could not be satisfactorily revised.
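
The interrater agreement index referred to here is the intraclass correlation (see Shrout & Fleiss, 1979, in the references). The report does not say which ICC variant was computed; one common form for the reliability of the mean of k raters is, in LaTeX notation,

    \mathrm{ICC}(1,k) \;=\; \frac{\mathrm{BMS} - \mathrm{WMS}}{\mathrm{BMS}}

where BMS and WMS are the between-targets (items) and within-targets mean squares from a one-way analysis of variance across the k raters; values near 1 indicate that the raters ordered the response options similarly.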
These ratings and subsequent discussions resulted in
substantial revisions to the CBPM. The revisions were
accomplished in preparation for a final review of the
CBPM by a panel of expert SMEs. For this final review
session, 12 controllers from the field were identified
who had extensive experience as controllers and had
spent time as either trainers or supervisors. Characteristics
of this final scaling panel group are shown in Table 4.1.
The final panel was also briefed on the project and the
CBPM and then reviewed each item. To ensure that
they used all of the important information in making
their ratings, short briefings were prepared for each
item, highlighting the most important pieces of information that affected the effectiveness of the various
responses. Each member of the panel then independently rated the effectiveness level of each response
option. This group did not review each others ratings
or discuss the items.
Interrater agreement data appear in Table 4.2. These results show great improvement: in the final scaling of the CBPM, 80 of the 94 items showed adequate interrater reliability. As a result of the review, 5 items were
dropped because there was considerable disagreement
among raters. These final scaling data were used to score
the CBPM. For each item, examinees were assigned the
mean effectiveness of the response option they chose,
with a few exceptions. First, for the knowledge items,
there was only one correct response. Similarly, for the
confliction prediction items, there was one correct
response. In addition, it is more effective to predict a

confliction when there is not one (i.e., be conservative) than to fail to predict a confliction when there is one.
Thus, a higher score was assigned for an incorrect
conservative response than an incorrect response that
predicted no confliction when one would have occurred. The controller SMEs generated rational keys for
23 knowledge and confliction prediction type items.
Figure 4.3 shows an example of a CBPM item. One final
revision of the CBPM was made based on pilot test data.
The pilot test will be discussed in a later section.
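
A small sketch of that scoring rule follows; the field names and point values are hypothetical, but the logic - assign the mean SME effectiveness of the chosen option, except that knowledge and confliction-prediction items are scored from a rational key, with an incorrect but conservative confliction call earning more credit than a miss - follows the description above.

    # Hypothetical CBPM item-scoring function.
    def score_item(item, choice):
        if item["type"] in ("knowledge", "confliction_prediction"):
            if choice == item["keyed_option"]:
                return item["correct_points"]
            if (item["type"] == "confliction_prediction"
                    and choice == item["conservative_option"]):
                return item["conservative_points"]   # safe error: partial credit
            return item["miss_points"]
        # Ordinary situational-judgment items: mean SME effectiveness (1-7 scale)
        return item["option_mean_effectiveness"][choice]
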

what different behavioral content. Because AT-SAT


focused on en route controllers, we limit our discussion
to scale development for that group.
The next step was to retranslate the performance
examples. This required controller SMEs to make two
judgments for each example. First, they assigned each
performance example to one (and only one) performance category. Second, the controllers rated the level
of effectiveness (from 1 = very ineffective to 7 = very
effective) of each performance example.
Thus, we assembled the ten performance categories
and 708 performance examples into four separate booklets that were used to collect the SME judgments just
discussed. In all, booklets were administered to 47 en
route controllers at three sites within the continental
United States. Because each booklet required 2-3 hours
to complete, each of the SMEs was asked to complete
only one booklet. As a result, each performance example
or item was evaluated by 9 to 20 controllers.
Results of the retranslation showed that 261 examples were relevant to the en route option, were sorted
into a single dimension more than 60% of the time, and
had standard deviations of less than 1.50 for the effectiveness ratings. These examples were judged as providing unambiguous behavioral performance information
with respect to both dimension and effectiveness level.
Then for each of the ten dimensions, the performance examples belonging to that dimension were
further divided into high effectiveness (retranslated at 5
to 7), middle effectiveness (3 to 5), and low effectiveness
(1-3). Behavior summary statements were written to
summarize all of the behavioral information reflected in
the individual examples. In particular, two or occasionally three behavior statements for each dimension and
effectiveness level (i.e., high, medium, or low) were
generated from the examples. Additional rationale for
this behavior summary scale method can be found in
Borman (1979).
As a final check on the behavior summary statements,
we conducted a retranslation of the statements using the
same procedure as was used with the individual examples. Seventeen en route controllers sorted each of the
87 statements into one of the dimensions and rated the
effectiveness level reflected on a 1-7 scale. Results of this
retranslation can be found in Pulakos, Keichel,
Plamondon, Hanson, Hedge, and Borman (1996). Finally, for those statements either sorted into the wrong
dimensions by 40% or more of the controllers or retranslated at an overly high or low effectiveness level, we

The Behavior Summary Scales


The intention here was to develop behavior-based
rating scales that would encourage raters to make evaluations as objectively as possible. An approach to accomplish this is to prepare scales with behavioral statements
anchoring different effectiveness levels on each dimension so that the rating task is to compare observed ratee
behavior with behavior on the scale. This matching
process should be more objective than, for example,
using a 1 = very ineffective to 7 = very effective scale. A
second part of this approach is to orient and train raters
to use the behavioral statements in the manner intended.
The first step in scale development was to conduct
workshops to gather examples of effective, mid-range,
and ineffective controller performance. Four such workshops were conducted with controllers teaching at the FAA Academy and with controllers at the Minneapolis Center. A total of 73 controllers participated in the workshops; they generated 708 performance examples.
We then analyzed these performance examples and
tentatively identified eight relevant performance categories: (1) Teamwork, (2) Coordinating, (3) Communicating, (4) Monitoring, (5) Planning/Prioritizing, (6)
Separation, (7) Sequencing/Preventing Delays, and (8)
Reacting to Emergencies. Preliminary definitions were
developed for these categories. A series of five mini-workshops were subsequently held with controllers to
review the categories and definitions. This iterative
process, involving 24 controllers, refined our set of
performance categories and definitions. The end result
was a set of ten performance categories. These final
categories and their definitions are shown in Table 4.3.
Interestingly, scale development work to this point
resulted in the conclusion that these ten dimensions
were relevant for all three controller options: tower cab,
TRACON, and en route. However, subsequent work
with tower cab controllers resulted in scales with some-


made revisions based on our analysis of the likely reason for the retranslation problem. The final behavior summary scales appear in Appendix C.
Regarding the rater orientation and training program, our experience and previous research have shown
that the quality of performance ratings can be improved
with appropriate rater training (e.g., Pulakos, 1984,
1986; Pulakos & Borman, 1986). Over the past several
years, we have been refining a training strategy that (1)
orients raters to the rating task and why the project
requires accurate evaluations; (2) familiarizes raters with
the rating dimensions and how each is defined; (3)
teaches raters how to most effectively use the behavior
summary statements to make objective ratings; (4)
describes certain rater errors (e.g., halo) in simple,
common-sense terms and asks raters to avoid them; and
finally (5) encourages raters to be as accurate as possible
in their evaluations.
For this application, we revised the orientation and
training program to encourage accurate ratings in this
setting. In particular, a script was prepared to be used by
persons administering the rating scales in the field.
Appendix D contains the script. In addition, a plan for
gathering rating data was created. Discussions with
controllers in the workshops described earlier suggested
that both supervisors and peers (i.e., fellow controllers)
would be appropriate rating sources. Because gathering
ratings from relatively large numbers of raters per ratee
is advantageous to increase levels of interrater reliability,
we requested that two supervisor and two peer raters be
asked to contribute ratings for each controller ratee in
the study. Supervisor and peer raters were identified
who had worked in the same area as a controller for at
least 6 months and were very familiar with their job
performance. For practical reasons we set a limit of 5-6
controllers to be rated by any individual rater in the
research. The rater orientation and training program
and the plan for administering the ratings in the field
were incorporated into a training module for those professionals selected to conduct the data collection. That training session is described in a subsequent section.
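The reasoning behind requesting several raters per ratee can be illustrated with the standard Spearman-Brown projection for the reliability of an average of k parallel raters; the formula is not quoted in the report itself, and the single-rater reliability of .50 below is only a hypothetical value.

# Spearman-Brown projection: reliability of the mean of k parallel raters.

def spearman_brown(single_rater_r, k):
    return k * single_rater_r / (1 + (k - 1) * single_rater_r)

for k in (1, 2, 4):
    # e.g., assuming a hypothetical single-rater reliability of .50
    print(k, round(spearman_brown(0.50, k), 2))
# 1 -> 0.5, 2 -> 0.67, 4 -> 0.8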

sponse fidelity (Tucker, 1984). Simulator studies of ATC problems have been reported in the literature since
the 1950s. Most of the early research was directed
toward the evaluation of effects of workload variables
and changes in control procedures on overall system performance, rather than focused on individual performance
assessment (Boone, Van Buskirk, and Steen, 1980).
However, there have been some research and development efforts aimed at capturing the performance of
air traffic controllers, including Buckley, O'Connor,
Beebe, Adams, and MacDonald (1969), Buckley,
DeBaryshe, Hitchner, and Kohn (1983), and
Sollenberger, Stein, and Gromelski (1997). For example, in the Buckley et al. (1983) study, trained
observers' ratings of simulator performance were found to be
highly related to various aircraft safety and expeditiousness measures. Full-scale dynamic simulation allows the
controller to direct the activities of a sample of simulated
air traffic, performing characteristic functions such as
ordering changes in aircraft speed or flight path, but within
a relatively standardized work sample framework.
The intention of the HFPM was to provide an
environment that would, as nearly as possible, simulate
actual conditions existing in the controller's job. One possibility considered was to test each controller working in his or her own facility's airspace. This approach
was eventually rejected, however, because of the problem of unequal difficulty levels across facilities and even
across sectors within a facility (Borman, Hedge, &
Hanson, 1992; Hanson, Hedge, Borman, & Nelson,
1993; Hedge, Borman, Hanson, Carter, & Nelson,
1993). Comparing the performance of controllers working in environments with unequal (and even unknown)
difficulty levels is extremely problematic. Therefore, we
envisioned that performance could be assessed using a
simulated air traffic environment. This approach was
feasible because of the availability at the FAA Academy
of several training laboratories equipped with radar
stations similar to those found in the field. In addition,
they use a generic airspace (Aero Center) designed to
allow presentation of typical air traffic scenarios that
must be controlled by the trainee (or in our case, the
ratee). Use of a generic airspace also allowed for standardization of assessment. See Figure 4.4 for a visual
depiction of the Aero Center airspace.
Thus, through use of the Academy's radar training
facility (RTF) equipment, in conjunction with the Aero
Center generic airspace, we were able to provide a test
environment affording the potential for both high stimulus and response fidelity. Our developmental efforts

The High-Fidelity Performance Measure (HFPM)


Measuring the job performance of air traffic controllers is a unique situation where reliance on a work
sample methodology may be especially applicable. Use
of a computer-generated simulation can create an ATC
environment that allows the controller to perform in a
realistic setting. Such a simulation approach allows the
researcher to provide high levels of stimulus and re-


focused, then, on: (1) designing and programming specific scenarios in which the controllers would control
air traffic; and (2) developing measurement tools for
evaluating controller performance.

It was decided that controller performance should be evaluated across broad dimensions of performance, as
well as at a more detailed step-by-step level. Potential
performance dimensions for a set of rating scales were
identified through reviews of previous literature involving air traffic controllers, existing on-the-job-training
forms, performance verification forms, and current AT-SAT work on the development of behavior summary
scales. The over-the-shoulder (OTS) nature of this
evaluation process, coupled with the maximal performance focus of the high-fidelity simulation environment, required the development of rating instruments
designed to facilitate efficient observation and evaluation of performance.
After examining several possible scale formats, we
chose a 7-point effectiveness scale for the OTS form,
with the scale points clustered into three primary effectiveness levels; i.e., below average (1 or 2), fully adequate
(3, 4, or 5), and exceptional (6 or 7). Through consultation with controllers currently working as Academy
instructors, we tentatively identified eight performance
dimensions and developed behavioral descriptors for
these dimensions to help provide a frame-of-reference
for the raters. The eight dimensions were: (1) Maintaining Separation; (2) Maintaining Efficient Air Traffic
Flow; (3) Maintaining Attention and Situation Awareness; (4) Communicating Clearly, Accurately, and Concisely; (5) Facilitating Information Flow; (6)
Coordinating; (7) Performing Multiple Tasks; and, (8)
Managing Sector Workload. We also included an overall performance category. As a result of rater feedback
subsequent to pilot testing (described later in this chapter), Facilitating Information Flow was dropped from
the form. This was due primarily to perceived overlap
between this dimension and several others, including
Dimensions 3, 4, 6, and 7. The OTS form can be found
in Appendix E.
A second instrument required the raters to focus on
more detailed behaviors and activities, and note whether
and how often each occurred. A Behavioral and Events
Checklist (BEC) was developed for use with each
scenario. The BEC required raters to actively observe
the ratees controlling traffic during each scenario and
note behaviors such as: (1) failure to accept hand-offs,
coordinate pilot requests, etc.; (2) letters of agreement
(LOA)/directive violations; (3) readback/hearback errors; (4) unnecessary delays; (5) incorrect information
input into the computer; and, (6) late frequency changes.
Raters also noted operational errors and deviations. The
BEC form can be found in Appendix F.

Scenario Development
The air traffic scenarios were designed to incorporate
performance constructs central to the controller's job,
such as maintaining aircraft separation, coordinating,
communicating, and maintaining situation awareness.
Also, attention was paid to representing in the scenarios
the most important tasks from the task-based job analysis. Finally, it was decided that, to obtain variability in
controller performance, scenarios should be developed
with either moderate or quite busy traffic conditions.
Thus, to develop our HFPM scenarios, we started with
a number of pre-existing Aero Center training scenarios,
and revised and reprogrammed them to the extent necessary to include relevant tasks and performance requirements under moderate- to high-intensity traffic conditions. In all,
16 scenarios were developed, each designed to run no
more than 60 minutes, inclusive of start-up, position relief
briefing, active air traffic control, debrief, and performance
evaluation. Consequently, active manipulation of air traffic was limited to approximately 30 minutes.
The development of a research design that would
allow sufficient time for both training and evaluation
was critical to the development of scenarios and accurate
evaluation of controller performance. Sufficient training time was necessary to ensure adequate familiarity
with the airspace, thereby eliminating differential knowledge of the airspace as a contributing factor to controller
performance. Adequate testing time was important to
ensure sufficient opportunity to capture controller performance and allow for stability of evaluation. A final
consideration, of course, was the need for controllers in
our sample to travel to Oklahoma City to be trained and
evaluated. With these criteria in mind, we arrived at a
design that called for one-and-one-half days of training,
followed by one full day of performance evaluation.
This schedule allowed us to train and evaluate two
groups of ratees per week.
Development of Measurement Instruments
High-fidelity performance data were captured by
means of behavior-based rating scales and checklists,
using trainers with considerable air traffic controller
experience or current controllers as raters. Development
and implementation of these instruments, and selection
and training of the HFPM raters are discussed below.


Rater Training
Fourteen highly experienced controllers from field
units or currently working as instructors at the FAA
Academy were detailed to the AT-SAT project to serve
as raters for the HFPM portion of the project. Raters
arrived approximately three weeks before the start of
data collection to allow time for adequate training and
pilot testing. Thus, our rater training occurred over an
extended period of time, affording an opportunity for
ensuring high levels of rater calibration.
During their first week at the Academy, raters were
exposed to (1) a general orientation to the AT-SAT
project, its purposes and objectives, and the importance
of the high-fidelity component; (2) airspace training;
(3) the HFPM instruments; (4) all supporting materials
(such as Letters of Agreement, etc.); (5) training and
evaluation scenarios; and (6) rating processes and procedures. The training program was an extremely hands-on, feedback-intensive process. During this first week
raters served as both raters and ratees, controlling traffic
in each scenario multiple times, as well as serving as
raters of their associates who took turns as ratees. This
process allowed raters to become extremely familiar
with both the scenarios and evaluation of performance
in these scenarios. With multiple raters evaluating performance in each scenario, project personnel were able
to provide immediate critique and feedback to raters,
aimed at improving accuracy and consistency of rater
observation and evaluation.
In addition, prior to rater training, we scripted
performances on several scenarios, such that deliberate
errors were made at various points by the individual
controlling traffic. Raters were exposed to these scripted
scenarios early in the training so as to more easily
facilitate discussion of specific types of controlling
errors. A standardization guide was developed with the
cooperation of the raters, such that rules for how observed behaviors were to be evaluated could be referred
to during data collection if any questions arose (see
Appendix G). All of these activities contributed to
enhanced rater calibration.

to the pilot test sites. In general, procedures for administering these two assessment measures proved to be
effective. Data were gathered on a total of 77 controllers
at the two locations. Test administrators asked pilot test
participants for their reactions to the CBPM, and many
of them reported that the situations were realistic and
like those that occurred on their jobs.
Results for the CBPM are presented in Table 4.4.
The distribution of total scores was promising in the
sense that there was variability in the scores. The coefficient alpha was moderate, as we might expect from a
test that is likely multidimensional. Results for the
ratings are shown in Tables 4.5 and 4.6. First, we were
able to approach our target of two supervisors and two
peers for each ratee. A mean of 1.24 supervisors and 1.30
peers per ratee participated in the rating program. In
addition, both the supervisor and peer ratings had
reasonable degrees of variability. Also, the interrater
reliabilities (intraclass correlations) were, in general,
acceptable. The Coordinating dimension is an exception. When interrater reliabilities were computed across
the supervisor and peer sources, they ranged from .37 to
.62 with a median of .54. Thus, reliability improves
when both sources' data are used.
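For readers unfamiliar with the statistic, the sketch below computes a one-way intraclass correlation from a complete ratee-by-rater matrix; the field data were unbalanced, so this is an illustration of the index rather than a reproduction of the study's computations, and the simulated data are placeholders.

# One-way intraclass correlation for interrater reliability (sketch).
import numpy as np

def icc_oneway(x):
    """x: 2-D array, rows = ratees (targets), columns = raters."""
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)
    ms_between = k * ((row_means - grand) ** 2).sum() / (n - 1)
    ms_within = ((x - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_mean_of_k = (ms_between - ms_within) / ms_between
    return icc_single, icc_mean_of_k

rng = np.random.default_rng(0)
true_score = rng.normal(5, 1, size=(50, 1))              # ratee "true" effectiveness
ratings = true_score + rng.normal(0, 1, size=(50, 4))    # 4 raters with error
print(icc_oneway(ratings))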
In reaction to the pilot test experience, we modified
the script for the rater orientation and training program.
We decided to retain the Coordinating dimension for
the main study, with the plan that if reliability continued to be low we might not use the data for that
dimension. With the CBPM, one item was dropped
because it had a negative item-total score correlation.
That is, controllers who answered this item correctly
tended to have low total CBPM scores.
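The item screen mentioned here can be illustrated with a short routine that flags items whose corrected item-total correlation is negative; the response matrix is a random placeholder, and the report's check may have used the uncorrected item-total correlation.

# Flag items with negative corrected item-total correlations (sketch).
import numpy as np

def corrected_item_total(scores):
    """scores: examinees x items matrix of item scores."""
    out = []
    for j in range(scores.shape[1]):
        rest = scores.sum(axis=1) - scores[:, j]      # total score excluding item j
        out.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(out)

rng = np.random.default_rng(1)
scores = rng.integers(0, 2, size=(200, 10)).astype(float)  # hypothetical 200 x 10 matrix
r = corrected_item_total(scores)
print(np.where(r < 0)[0])   # indices of items considered for dropping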
The primary purpose of the HFPM pilot test was to
determine whether our rigorous schedule of one-and-one-half days of training and one day of evaluation was
feasible administratively. Our admittedly ambitious
design required completion of up to eight practice
scenarios and eight graded scenarios. Start-up and shutdown of each computer-generated scenario at each radar
station, setup and breakdown of associated flight strips,
pre- and post-position relief briefings, and completion
of OTS ratings and checklists all had to be accomplished
within the allotted time, for all training and evaluation
scenarios. Thus, smooth coordination and timing of
activities was essential. Prior to the pilot test, preliminary dry runs had already convinced us to eliminate
one of the eight available evaluation scenarios, due to
time constraints.

Pilot Tests of the Performance Measures


The plan was to pilot test the CBPM and the performance rating program at two Air Route Traffic Control
Centers (ARTCCs), Seattle and Salt Lake City. The
HFPM was to be pilot tested in Oklahoma City. All
materials were prepared for administration of the CBPM
and ratings, and two criterion research teams proceeded


Six experienced controllers currently employed as instructors at the FAA Academy served as our ratees for
the pilot test. They were administered the entire two-and-one-half-day training/evaluation process, from orientation through final evaluation scenarios. As a result
of the pilot test, and in an effort to increase the efficiency
of the process, minor revisions were made to general
administrative procedures. However, in general, procedures for administering the HFPM proved to be effective; all anticipated training and evaluation requirements
were completed on time and without major problems.
In addition to this logistical, administration focus of
the pilot test, we also examined the consistency of
ratings by our HFPM raters. Two raters were assigned
to each ratee, and the collection of HFPM data by two
raters for each ratee across each of the seven scenarios
allowed us to check for rater or scenario peculiarities.
Table 4.7 presents correlations between ratings for
rater pairs both across scenarios and within each scenario, and suggested that Scenarios 2 and 7 should be
examined more closely, as well as three OTS dimensions
(Communicating Clearly, Accurately, and Efficiently;
Facilitating Information Flow; and Coordination). To
provide additional detail, we also generated a table
showing magnitude of effectiveness level differences
between each rater pair for each dimension on each
scenario (see Appendix H).
Examination of these data and discussion with our
raters helped us to focus on behaviors or activities in the
two scenarios that led to ambiguous ratings and to
subsequently clarify these situations. Discussions concerning these details with the raters also allowed us to
identify specific raters in need of more training. Finally,
extensive discussion surrounding the reasons for lower
than expected correlations on the three dimensions
led to the conclusion that excessive overlap among
the three dimensions generated confusion as to where to
represent the observed performance. As a result, the
Facilitating Information Flow dimension was dropped
from the OTS form.

mid-scenario). Test site managers had an opportunity to practice setting up the testing stations and review the
beginning portion of the test. They were also briefed on
the performance rating program. We described procedures for obtaining raters and training them. The script
for training raters was thoroughly reviewed and rationale for each element of the training was provided.
Finally, we answered all of the test site managers'
questions. These test site managers hired and trained
data collection staff at their individual testing locations.
There were a total of 20 ARTCCs that participated in the
concurrent validation study (both Phase 1 and Phase 2).
Data Collection
CBPM data were collected for 1046 controllers.
Performance ratings for 1227 controllers were provided
by 535 supervisor and 1420 peer raters. Table 4.8 below
shows the number of supervisors and peers rating each
controller. CBPM and rating data were available for
1043 controllers.
HFPM data were collected for 107 controllers. This
sample was a subset of the main sample so 107 controllers had data for the CBPM, the ratings, and the HFPM.
In particular, controllers from the main sample arrived
in Oklahoma City from 12 different air traffic facilities
throughout the U.S. to participate in the two-and-one-half-day HFPM process. The one-and-one-half days of
training consisted of four primary activities: orientation, airspace familiarization and review, airspace certification testing, and scenarios practice. To accelerate
learning time, a hard copy and computer disk describing
the airspace had been developed and sent to controllers
at their home facility to review prior to arrival in
Oklahoma City.
Each controller was then introduced to the Radar
Training Facility (RTF) and subsequently completed
two practice scenarios. After completion of the second
scenario and follow-up discussions about the experience, the controllers were required to take an airspace
certification test. The certification consisted of 70 recall
and recognition items designed to test knowledge of
Aero Center. Those individuals not receiving a passing
grade (at least 70% correct) were required to retest on
that portion of the test they did not pass. The 107
controllers scored an average of 94% on the test, with
only 7 failures (6.5%) on the first try. All controllers
subsequently passed the retest and were certified by the
trainers to advance to the remaining day of formal
evaluation.

Training the Test Site Managers


Our staff prepared a manual describing data collection procedures for the criterion measures during the
concurrent validation and conducted a half-day training session on how to collect criterion data in the main
sample. We reviewed the CBPM, discussed administration issues, and described procedures for handling problems (e.g., what to do when a computer malfunctions in


After successful completion of the air traffic test, each controller received training on six additional air traffic
scenarios. During this time, the raters acted as trainers
and facilitated the ratees' learning of the airspace. While
questions pertaining to knowledge of airspace and related regulations were answered by the raters, coaching
ratees on how to more effectively and efficiently control
traffic was prohibited.
After the eight training scenarios were completed, all
ratees' performance was evaluated on each of seven
scenarios that together required approximately 8 hours
to complete. The seven scenarios consisted of four
moderately busy and three very busy air traffic conditions, increasing in complexity from Scenario 1 to
Scenario 7. During this 8 hour period of evaluation,
raters were randomly assigned to ratees before each
scenario, with the restriction that a rater should not be
assigned to a ratee (1) from the rater's home facility; or (2) if he/she had been the rater's training scenario assignment.
While ratees were controlling traffic in a particular
scenario, raters continually observed and noted performance using the BEC. After the scenario ended, each
rater completed the OTS ratings. In all, 11 training/
evaluation sessions were conducted within a 7-week
period. During four of these sessions, a total of 24 ratees
were evaluated by two raters at a time, while a single rater
evaluated ratee performance during the other seven
sessions.

the peer reliabilities, but the differences are for the most
part very small. Importantly, the combined supervisor/
peer ratings' reliabilities are substantially higher than the reliabilities for either source alone. Conceptually, it seems appropriate to get both rating sources' perspectives on controller performance. Supervisors typically
have more experience evaluating performance and have
seen more incumbents perform in the job; peers often
work side-by-side with the controllers they are rating,
and thus have good first-hand knowledge of their performance. The result of higher reliabilities for the combined ratings makes an even more convincing argument
for using both rating sources.
Scores for each ratee were created by computing the
mean peer and mean supervisor rating for each dimension. Scores across peer and supervisor ratings were also
computed for each ratee on each dimension by taking
the mean of the peer and supervisor scores. Table 4.13
presents the means and standard deviations for these
rating scores on each dimension, supervisors and peers
separately, and the two sources together. The means are
higher for the peers (range = 5.03-5.46), but the standard deviations for that rating source are generally
almost as high as those for the supervisor raters.
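As an illustration of this score construction, the sketch below builds the per-source means and the combined score from a hypothetical long-format rating table; the column names and values are assumptions, not the study's data files.

# Mean rating within each source, then the mean of the two source means (sketch).
import pandas as pd

ratings = pd.DataFrame({
    "ratee":     [1, 1, 1, 1, 2, 2, 2, 2],
    "source":    ["peer", "peer", "sup", "sup"] * 2,
    "dimension": ["Communicating"] * 8,
    "rating":    [6, 5, 5, 4, 4, 5, 3, 4],
})

by_source = (ratings
             .groupby(["ratee", "dimension", "source"])["rating"]
             .mean()
             .unstack("source"))
by_source["combined"] = by_source[["peer", "sup"]].mean(axis=1)
print(by_source)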
Table 4.14 presents the intercorrelations between
supervisor and peer ratings on all of the dimensions.
First, within rating source, the between-dimension correlations are large. This is common with rating data.
And second, the supervisor-peer correlations for the
same dimensions (e.g., Communicating = .39) are at
least moderate in size, again showing reasonable agreement across-source regarding the relative levels of effectiveness for the different controllers rated.
The combined supervisor/peer ratings were factor
analyzed to explore the dimensionality of the ratings.
This analysis addresses the question, is there a reasonable way of summarizing the 10 dimensions with a
smaller number of composite categories? The 3-factor
solution, shown in Table 4.15, proved to be the most
interpretable. The first factor was called Technical
Performance, with Dimensions 1, 3, 6, 7, and 8 primarily defining the factor. Technical Effort was the label
for Factor 2, with Dimensions 2, 4, 5, and 9 as the
defining dimensions. Finally, Factor 3 was defined by a
single dimension and was called Teamwork.
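A sketch of the kind of exploratory factor analysis described here appears below, using random data in place of the actual 10-dimension rating matrix; the extraction method and rotation the study used are not specified in the report, so this is only an illustration.

# Extract three factors from 10 rating dimensions and inspect loadings (sketch).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
ratings = rng.normal(5, 1, size=(1000, 10))    # placeholder: ratees x 10 dimensions

fa = FactorAnalysis(n_components=3, random_state=0)
fa.fit(ratings)
loadings = fa.components_.T                    # 10 dimensions x 3 factors
print(np.round(loadings, 2))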
Although the 3-factor solution was interpretable,
keeping the three criterion variables separate for the
validation analyses seemed problematic. This is because
(1) the variance accounted for by the factors is very
uneven (82% of the common variance is accounted for

Results
CBPM
Table 4.9 shows the distribution of CBPM scores. As
with the pilot sample, there is a reasonable amount of
variability. Also, item-total score correlations range
from .01 to .27 (mean = .11). The coefficient alpha was
.63 for this 84-item test. The relatively low item-total
correlations and the modest coefficient alpha suggest that
the CBPM is measuring more than a single construct.
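Coefficient alpha of the kind reported here can be computed directly from an examinee-by-item matrix, as the sketch below shows; the score matrix is a random placeholder rather than the study's data.

# Cronbach's coefficient alpha (sketch).
import numpy as np

def cronbach_alpha(scores):
    """scores: examinees x items. Standard formula:
    k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(3)
scores = rng.integers(0, 2, size=(1046, 84)).astype(float)  # placeholder 1046 x 84 matrix
print(round(cronbach_alpha(scores), 2))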
Supervisor and Peer Ratings
In Tables 4.10 and 4.11, the number and percent of
ratings at each scale point are depicted for supervisors
and peers separately. A low but significant percentage of
ratings are at the 1, 2, or 3 level for both supervisor and
peer ratings. Most of the ratings fall at the 4-7 level, but
overall, the variability is reasonable for both sets of ratings.
Table 4.12 contains the interrater reliabilities for the
supervisor and peer ratings separately and for the two
sets of ratings combined. In general, the reliabilities are
quite high. The supervisor reliabilities are higher than


by the first factor); (2) the correlation between unit-weighted composites representing the first two factors is
.78; correlations between each of these composites and
Teamwork are high as well (.60 and .63 respectively);
and (3) all but one of the 10 dimensions loads on a
technical performance factor, so it seemed somewhat
inappropriate to have the one-dimension Teamwork
variable representing 1/3 of the rating performance
domain.
Accordingly, we formed a single rating variable represented by a unit-weighted composite of ratings on the
10 dimensions. The interrater reliability of this composite is .71 for the combined supervisor and peer rating
data. This is higher than the reliabilities for individual
dimensions. This would be expected, but it is another
advantage of using this summary rating composite to
represent the rating data.

median interrater reliabilities ranging from a low of .83 for Maintaining Attention and Situation Awareness to a
high of .95 for Maintaining Separation. In addition,
these OTS dimensions were found to be highly
intercorrelated (median r = .91). Because of the high
levels of dimension intercorrelation, an overall composite will be used in future analyses.
All relevant variables for the OTS and BEC measures
were combined and subjected to an overall principal
components analysis to represent a final high-fidelity
performance criterion space. The resulting two-factor
solution is presented in Table 4.18. The first component, Overall Technical Proficiency, consists of the OTS
rating scales, plus the operational error, operational
deviation, and LOA/Directive violation variables from
the BEC. The second component is defined by six
additional BEC variables and represents a sector management component of controller performance. More specifically, this factor represents Poor Sector Management,
whereby the controllers more consistently make late
frequency changes, fail to accept hand-offs, commit
readback/hearback errors, fail to accommodate pilot
requests, delay aircraft unnecessarily, and enter incorrect information in the computer. This interpretation is
reinforced by the strong negative correlation (-.72)
found between Overall Technical Proficiency and Poor
Sector Management.

HFPM
Table 4.16 contains descriptive statistics for the
variables included in both of the rating instruments
used during the HFPM graded scenarios. For the OTS
dimensions and the BEC, the scores represent averages
across each of the seven graded scenarios.
The means of the individual performance dimensions from the 7-point OTS rating scale are in the first
section of Table 4.16 (Variables 1 through 7). They
range from a low of 3.66 for Maintaining Attention and
Situation Awareness to a high of 4.61 for Communicating
Clearly, Accurately and Efficiently. The scores from each
of the performance dimensions are slightly negatively
skewed but are, for the most part, normally distributed.
Variables 8 through 16 in Table 4.16 were collected
using the BEC. To reiterate, these scores represent
instances where the controllers had either made a mistake or engaged in some activity that caused a dangerous
situation, a delay, or in some other way impeded the
flow of air traffic through their sector. For example, a
Letter of Agreement (LOA)/Directive Violation was judged
to have occurred if a jet was not established at 250 knots
prior to crossing the appropriate arrival fix or if a
frequency change was issued prior to completion of a
handoff for the appropriate aircraft. On average, each
participant had 2.42 LOA/Directive Violations in each
scenario.
Table 4.17 contains interrater reliabilities for the
OTS Ratings for those 24 ratees for whom multiple rater
information was available. Overall, the interrater
reliabilities were quite high for the OTS ratings, with

Correlations Between the Criterion Measures: Construct Validity Evidence
Table 4.19 depicts the relationships between scores
on the 84-item CBPM, the two HFPM factors, and the
combined supervisor/peer ratings. First, the correlation
between the CBPM total scores and the HFPM Factor
1, arguably our purest measure of technical proficiency,
is .54. This provides strong evidence for the construct
validity of the CBPM. Apparently, this lower fidelity
measure of technical proficiency is tapping much the
same technical skills as the HFPM, which had controllers working in an environment highly similar to their
actual job setting. In addition, a significant negative
correlation exists between the CBPM and the second
HFPM factor, Poor Sector Management.
Considerable evidence for the construct validity of
the ratings is also evident. The correlation between the ratings and the first HFPM factor is .40. Thus, the
ratings, containing primarily technical proficiency-oriented content, correlate substantially with our highest
fidelity measure of technical proficiency. The ratings

also correlate significantly with the second HFPM factor (r = -.28), suggesting the broad-based coverage of
the criterion space toward which the ratings were targeted. Finally, the ratings-CBPM correlation is .22,
suggesting that the ratings also share variance associated
with the judgment, decision-making, and procedural
knowledge constructs we believe the CBPM is measuring. This suggests that, as intended, the ratings on the
first two categories are measuring the typical performance component of technical proficiency.
Overall, there is impressive evidence that the CBPM
and the ratings are measuring the criterion domains they
were targeted to measure. At this point, and as planned,
we examined individual CBPM items and their relations to the other criteria, with the intention of dropping items that were not contributing to the desired
relationships. For this step, we reviewed the item-total
score correlations, and CBPM item correlations with
HFPM scores and the rating categories. Items with very
low or negative correlations with: (1) total CBPM
scores; (2) the HFPM scores, especially for the first
factor; and (3) the rating composite were considered for
exclusion from the final CBPM scoring system. Also
considered were the links to important tasks. The linkage analysis is described in a later section. Items representing one or more highly important tasks were given
additional consideration for inclusion in the final composite. These criteria were applied concurrently and in
a compensatory manner. Thus, for example, a quite low
item-total score correlation might be offset by a high
correlation with HFPM scores.
This item review process resulted in 38 items being
retained for the final CBPM scoring system. The resulting CBPM composite has a coefficient alpha of .61 and
correlates .61 and -.42 with the two HFPM factors, and
.24 with the rating composite. Further, coverage of the
40 most important tasks is at approximately the same
level, with all but one covered by at least one CBPM
item. Thus, the final composite is related more strongly
to the first HFPM factor, and correlates a bit more
highly with the technically-oriented rating composite.
We believe this final CBPM composite has even better
construct validity in relation to the other criterion
measures than did the total test.
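The compensatory review described above can be thought of as combining standardized indicators and retaining the best-scoring items; the weights, the bonus for key-task coverage, and the data in the sketch below are illustrative assumptions, since the report gives no explicit formula.

# One way to operationalize a compensatory item review (illustrative only).
import numpy as np

def review_items(item_total_r, hfpm_r, rating_r, covers_key_task, keep_n=38):
    """Each argument is an array with one entry per CBPM item."""
    # Standardize each criterion so that strength on one indicator can offset
    # weakness on another (the "compensatory" aspect).
    z = lambda a: (a - a.mean()) / a.std(ddof=1)
    score = z(item_total_r) + z(hfpm_r) + z(rating_r)
    # Items linked to highly important tasks get extra consideration
    # (the 0.5 weight is an assumption).
    score = score + 0.5 * covers_key_task
    return np.argsort(score)[::-1][:keep_n]   # retain the highest-scoring items

rng = np.random.default_rng(4)
n_items = 84
kept = review_items(rng.uniform(-0.1, 0.3, n_items),   # item-total correlations
                    rng.uniform(-0.2, 0.4, n_items),   # correlations with HFPM factor 1
                    rng.uniform(-0.1, 0.3, n_items),   # correlations with rating composite
                    rng.integers(0, 2, n_items))       # 1 if item covers a key task
print(sorted(kept))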

in the validation of controller predictor measures. Some of the more promising archival measures are those
related to training performance, especially the time to
complete various phases of training and ratings of
performance in these training phases. However, there
are some serious problems even with these most promising measures (e.g., lack of standardization across facilities,
measures are not available for all controllers). Thus, our
approach in the present effort was to use these measures
to further evaluate the construct validity of the AT-SAT
criterion measures.
In general, training performance has been shown to
be a good predictor of job performance, so measures of
training performance should correlate with the ATSAT measures of job performance. Training performance data were available for 809 of the 1227 controllers
in the concurrent validation sample. Two of the on-the-job training phases (Phase 6 and Phase 9) are reasonably
standardized across facilities, so performance measures
from these two phases are good candidates for use as
performance measures. We examined the correlation
between ratings of performance across these two phases
and the correlations between five variables measuring
training time (hours and days to complete training at
each phase). The rating measures did not even correlate
significantly with each other, and thus were not included in further analyses. Correlations between the
training time variables were higher. Because the time
variables appeared to be tapping similar performance
dimensions, we standardized and added these measures
to create a training time scale. Controllers with fewer
than four out of the five variables measuring training
time were removed from further analyses (N=751).
Correlations between training time and ratings of performance are moderate (r = .23). The correlation with
CBPM scores is small but also significant (.08; p < .05).
Thus, the correlations with training time support the
construct validity of the AT-SAT field criterion measures. (Sample sizes for the HFPM were too small to
conduct these analyses.)
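A sketch of the training-time scale construction is shown below, with hypothetical column names and placeholder data; the report says the standardized measures were added, and the sketch averages the available z-scores instead to handle the occasional missing value, a minor departure that preserves the ordering of controllers.

# Standardize the five training-time variables, apply the completeness rule,
# and form the composite (sketch with placeholder data).
import numpy as np
import pandas as pd

def training_time_scale(df, cols, min_present=4):
    z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=1)   # standardize each variable
    enough = df[cols].notna().sum(axis=1) >= min_present      # keep cases with 4 of 5 values
    return z[enough].mean(axis=1)                             # composite for retained cases

rng = np.random.default_rng(5)
cols = [f"time_phase_{i}" for i in range(1, 6)]               # hypothetical variable names
df = pd.DataFrame(rng.normal(100, 20, size=(809, 5)), columns=cols)
df.iloc[::10, [0, 1]] = np.nan                                # sprinkle some missing data
scale = training_time_scale(df, cols)
print(len(scale))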
Linkage Analysis
A panel of 10 controller SMEs performed a judgment task with the CBPM items. These controllers
were divided into three groups, and each group was
responsible for approximately one third of the 40
critical tasks that were targeted by the CBPM. They
reviewed each CBPM scenario and the items, and
indicated which of these important tasks from the job

Additional Construct Validity Evidence


Hedge et al. (1993) discuss controller performance
measures that are currently collected and maintained by
the FAA and the issues in using these measures as criteria


analysis were involved in each item. These ratings were then discussed by the entire group until a
consensus was reached. Results of that judgment task
appear in Table 4.20. For each task, the table shows
the number of CBPM items that this panel agreed
measured that task.
Similarly, 10 controller SMEs performed a judgment task with the seven HFPM scenarios. These
controllers were divided into two groups, and each
group was responsible for half of the scenarios. Each
scenario was viewed in three 10-minute segments,
and group members noted if a critical subactivity was
performed. After the three 10-minute segments for a
given scenario were completed, the group discussed
their ratings and arrived at a consensus before proceeding to the next scenario. Results of these judgments can also be found in Table 4.20. In summary,
38 of the 40 critical subactivities were covered by at
least a subset of the seven scenarios. On average,
almost 25 subactivities appeared in each scenario.

Conclusions
The 38-item CBPM composite provides a very good
measure of the technical skills necessary to separate
aircraft effectively and efficiently on the real job. The
.61 correlation with the highly realistic HFPM (Factor
1) is especially supportive of its construct validity for
measuring performance in the very important technical
proficiency-related part of the job. Additional ties to the
actual controller job are provided by the links of CBPM
items to the most important controller tasks identified
in the job analysis.
The performance ratings provide a good picture of
the typical performance over time elements of the job.
Obtaining both a supervisor and a peer perspective on
controller performance provides a relatively comprehensive view of day-to-day performance. High interrater
agreement across the two rating sources further strengthens the argument that the ratings are valid evaluations of
controller performance.
Thus, impressive construct validity evidence is demonstrated for both the CBPM and the rating composite.
Overall, we believe the 38-item CBPM and the rating
composite represent a comprehensive and valid set of
criterion measures.


CHAPTER 5.1
FIELD PROCEDURES FOR CONCURRENT VALIDATION STUDY

Lucy B. Wilson, Christopher J. Zamberlan, and James H. Harris
Caliber Associates

The concurrent validation data collection was carried out in 12 locations from May to July, 1997. Additional
data were collected in 4 locations from March to May,
1998 to increase the sample size. Data collection activities involved two days of computer-aided test administration with air traffic controllers and the collection of
controller performance assessments from supervisory
personnel and peers. Each site was managed by a trained
Test Site Manager (TSM) who supervised trained on-site data collectors, also known as Test Administrators
(TAs). A subset of 100 air traffic controllers from the
May-July sample (who completed both the predictor
and criterion battery of testing and for whom complete
sets of performance assessment information were available), was selected to complete the high fidelity criterion
test at the Academy in Oklahoma City. See Chapter 4 for
a description of this activity.

The additional testing in 1998 ran in Chicago, Cleveland, Washington, DC, and Oklahoma City. The en-route centers of Chicago and Cleveland performed like
the original AT-SAT sites, testing their own controllers.
The en-route center at Leesburg, Virginia, which serves
the Washington, DC area, tested their controllers as well
as some from New York. At the Mike Monroney Aeronautical Center in Oklahoma City, the Civil Aeromedical Institute (CAMI), with the help of Omni personnel,
tested controllers from Albuquerque, Atlanta, Houston,
Miami, and Oakland. All traveling controllers were
scheduled by Caliber with the help of Arnold Trevette in
Leesburg and Shirley Hoffpauir in Oklahoma City.
Field Period
Data collection activities began early in the Ft. Worth
and Denver Centers in May, 1997. The remaining nine
centers came on line two weeks later. To ensure adequate
sample size and diversity of participants, one additional
field site, Atlanta, was included beginning in June
1997. The concurrent data collection activities continued in all locations until mid-July.
Of the four sites in 1998, Chicago started the earliest
and ran the longest, for a little over two months beginning in early March. Washington, DC began simultaneously, testing and rating for just under two months.
Cleveland and Oklahoma City began a couple of weeks
into March and ended after about four and five weeks,
respectively.

Criterion Measure Pretest


An in-field pretest of the computerized criterion
measure and the general protocol to be used in the
concurrent validation test was conducted in April, 1997.
The en-route air traffic control centers of Salt Lake City,
UT and Seattle, WA served as pretest sites. A trained
TSM was on site and conducted the pretest in each
location.
Field Site Locations
In 1997, the concurrent validation testing was conducted in 12 en-route air traffic control centers across the
country. The test center sites were:

Atlanta, GA
Albuquerque, NM
Boston, MA
Denver, CO
Ft. Worth, TX
Houston, TX

Selection and Training of Data Collectors


A total of 13 experienced data collection personnel
were selected to serve as TSMs during the first data
collection. One manager was assigned to each of the test
centers and one TSM remained on call in case an
emergency replacement was needed in the field.
All TSMs underwent an intensive 3-day training in
Fairfax, VA from April 22 to 24, 1997. The training was
led by the team of designers of the concurrent validation
tests. The objective of the training session was three-fold:

Jacksonville, FL
Kansas City, MO
Los Angeles, CA
Memphis, TN
Miami, FL
Minneapolis, MN


To acquaint TSMs with the FAA and the en route air traffic control environment in which the testing was to
be conducted
To familiarize TSMs with the key elements of the
concurrent validation study and their roles in it
To ground TSMs in the AT-SAT test administration
protocol and field procedures.

time. While Oklahoma City had the capacity to test 15 controllers at a time, it did not use its expanded capability and operated like every other five-computer site, for
all intents and purposes.
At the beginning of the first day of the 2-day testing
effort, the data collector reviewed the Consent Form
with each participating controller and had it signed and
witnessed. (See the appendix for a copy of the Consent
Form.) Each controller was assigned a unique identification number through which all parts of the concurrent
validation tests were linked.
The predictor battery usually was administered on the
first day of controller testing. The predictor battery was
divided into four blocks with breaks permitted between
each block and lunch generally taken after completion of
the second block.
The second day of testing could occur as early as the
day immediately following the first day of testing or
could be scheduled up to several weeks later. The
second day of concurrent validation testing involved
completion of the computerized criterion test, that is,
the Computer Based Performance Measure (CBPM),
and the Biographical Information Form. (See appendix for a copy of the Biographical Information Form.)
At the end of the second day of testing, participating
controllers were asked to give their social security
numbers so that archival information (e.g., scores on
Office of Personnel Management employment tests)
could be retrieved and linked to their concurrent
validation test results.

A copy of the TSM training agenda is attached.


Each TSM was responsible for recruiting and training
his or her on-site data collectors who administered the
actual test battery. The TSM training agenda was adapted
for use in training on-site data collectors. In addition to
didactic instruction and role-playing, the initial test
administrations of all on-site data collectors were observed and critiqued by the TSMs.
Three TSMs repeated their role in the second data
collection. Because of the unique role of the fourth site
in the second data collection (e.g., a lack of previous
experience from the first data collection and three times
as many computers, or testing capability, as any other
testing site), Caliber conducted a special, lengthier training for CAMI personnel in Oklahoma City before the
second data collection began.
Site Set Up
TSMs traveled to their sites a week in advance of the
onset of data collection activities. During this week they
met with the en-route center personnel and the Partner
Pairs assigned to work with them. The Partner Pairs
were composed of a member of ATC management and
the union representative responsible for coordinating
the center's resources and scheduling the air traffic
controllers for testing. Their assistance was invaluable to
the success of the data collection effort.
TSMs set up and secured their testing rooms on site
during this initial week and programmed five computers
newly acquired for use in the concurrent validation.
They trained their local data collectors and observed
their first day's work.

Supervisory Assessments
Participating controllers nominated two supervisory
personnel and two peers to complete assessments of them
as part of the criterion measurement. While the selection
of the peer assessors was totally at the discretion of the
controller, supervisory and administrative staff had more
leeway in selecting the supervisory assessors (although
not one's supervisor of record) from the much smaller
pool of supervisors in order to complete the ratings.
Throughout the data collection period, supervisors and
peers assembled in small groups and were given standardized instructions by on-site data collectors in the
completion of the controller assessments. To the extent
feasible, supervisors and peers completed assessments in
a single session on all the controllers who designated
them as their assessor. When the assessment form was
completed, controller names were removed and replaced

Air Traffic Controller Testing


Up to five controllers could be, and frequently were,
tested on an 8-hour shift. Testing was scheduled at the
convenience of the center, with most of the testing
occurring during the day and evening shifts, although
weekend shifts were included at the discretion of the site.
Controllers were scheduled to begin testing at the same


by their unique identification numbers. The assessment forms were placed in sealed envelopes as a further means
of protecting confidentiality.
During the second data collection, assessors sometimes viewed PDRI's How To video in lieu of verbal
instruction. This was especially important at the five
non-testing sites that had no TSMs or on-site data
collectors (Albuquerque, Atlanta, Houston, Miami, and
Oakland). The four testing sites employed the video
much less frequently, if at all.

tors transmitted completed test information (on diskettes) and hard copies of the Biographical Information
and performance assessment forms to the data processing
center in Alexandria, VA.
Site Shut Down
At the end of the data collection period, each site was
systematically shut down. The predictor and criterion
test programs were removed from the computers, as were
any data files. Record logs, signed consent forms, unused
test materials, training manuals and other validation
materials were returned to Caliber Associates. Chicago,
the last site of the second data collection effort, shut
down on Monday, May 11, 1998.

Record Keeping and Data Transmission


On-site data collectors maintained records of which
controllers had participated and which tests had been
completed. This information was reported on a daily
basis to TSMs. Several times a week on-site data collec-


CHAPTER 5.2
DEVELOPMENT OF PSEUDO-APPLICANT SAMPLE
Anthony Bayless, Caliber Associates

RATIONALE FOR PSEUDO-APPLICANT SAMPLE

mates from an unrestricted sample (i.e., a pool of subjects that more closely represents the potential range of
applicants), the underestimation of predictor validity
computed from the restricted sample can be corrected.

Prior to becoming a Full Performance Level (FPL) controller, ATCSs have been previously screened on
their entry-level OPM selection test scores, performance in one of the academy screening programs, and
on-the-job training performance. Because of these
multiple screens and stringent cutoffs, only the better
performing ATCSs are retained within the air traffic
workforce. For these reasons, the concurrent validation of the AT-SAT battery using a sample of ATCSs
is likely to result in an underestimate of the actual
validity because of restriction in range in the predictors. The goal of this part of the project, then, was to
administer the AT-SAT predictor battery to a sample
that more closely resembled the likely applicant pool
than would a sample of ATCS job incumbents.
The purpose of including a pseudo-applicant (PA)
sample in the validation study was to obtain variance
estimates from an unrestricted sample (i.e., not explicitly
screened on any prior selection criteria). Data collected
from the PA study were used to statistically correct
predictor scores obtained from the restricted, concurrent
validation sample of ATCS job incumbents. This statistical correction was necessary because the validity of
predictors is based on the strength of the relationship
between the predictors and job performance criteria. If
this relationship was assessed using only the restricted
sample (i.e., FAA job incumbents who have already been
screened and selected) without any statistical correction,
the strength of the relationships between the predictors
and job performance criteria would be underestimated.1
This underestimation of the validity of the predictors
might lead to an omission of an important predictor
based on an inaccurate estimation of its validity. By
using the PA data to obtain variance/covariance esti-

ATCS Applicant Pool


The administration of the AT-SAT predictor battery
to a sample closely resembling the applicant pool required an analysis of the recent ATCS applicant pool.
Therefore, the project team requested from the FAA data
about recent applicants for the ATCS job. Because of a
recent hiring freeze on ATCS positions, the latest background data available for ATCS applicants was from
1990 through part of 1992. Although the data were
somewhat dated (i.e., 1990-1992), they did provide some
indication of the characteristics that should be emulated
in the PA sample. Based on a profile analysis provided by
the FAA, relevant background characteristics of 36,024
actual applicants for FAA ATCS positions were made
available. Table 5.2.1 provides a breakout of some pertinent variables from that analysis.
The data indicated that about 81% of applicants were
male, 50% had some college education but no degree,
and 26% had a bachelor's degree. A disconcerting fact
from the OPM records was the large percentage of
missing cases (51.3%) for the race/ethnicity variable.
Information available for the race/ethnicity variable represented data from 17,560 out of 36,024 cases. Another
issue of some concern was the age of the data provided.
The latest data were at least four years old. Although it
seems unlikely that the educational profile of applicants
would have changed much over four years, it was more
likely that the gender and the race/ethnicity profiles may
have changed to some extent over the same period of time
(i.e., more female and ethnic minority applicants).

This underestimate is the result of decreased variation in the predictor scores of job incumbents; they would all be
expected to score relatively the same on these predictors. When there is very little variation in a variable, the strength of its
association with another variable will be weaker than when there is considerable variation. In the case of these predictors,
the underestimated relationships are a statistical artifact resulting from the sample selection.
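The simplest illustration of the correction motivated here is the univariate (Thorndike Case 2) adjustment shown below; the AT-SAT analyses used the pseudo-applicant variance/covariance estimates and may have applied a more general multivariate correction, and the numbers in the example are hypothetical.

# Univariate correction for direct range restriction (sketch).
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Correct a validity coefficient computed in a directly restricted sample."""
    u = sd_unrestricted / sd_restricted          # ratio of predictor SDs
    r = r_restricted
    return (r * u) / math.sqrt(1 - r**2 + (r**2) * (u**2))

# Hypothetical numbers: observed validity .25 among incumbents, predictor SD
# of 12 in the pseudo-applicant sample versus 8 in the incumbent sample.
print(round(correct_for_range_restriction(0.25, 12, 8), 2))  # about .36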


Because of the concern about the age of the applicant pool data and the amount of missing data for the race/
ethnicity variable, a profile of national background characteristics was obtained from the U.S. Bureau of the
Census. As shown in Table 5.2.2, 1990 data from the
U.S. Bureau of the Census indicated the following
national breakout for race/ethnicity:
Without more up-to-date and accurate data about the
applicant pool, the national data were used to inform
sampling decisions. Using the race/ethnicity percentages provided above as the basis for preliminary sampling plans, we recommended that a total sample of at least 300 PAs be obtained, assuming it followed the same distributional characteristics as the national race/ethnicity data.

areas). An example classified newspaper advertisement is shown in Figure 5.2.1. Another means of advertising the
testing opportunity was to place flyers at locations in
proximity to the testing site. For example, flyers were
placed at local vocational technical schools and colleges/
universities. An example flyer advertisement is shown in
Figure 5.2.2. A third means of advertising the testing to
civilian PAs was to publicize the effort via ATCS to their
family, friends, and acquaintances.
When responding to any form of advertisement,
potential civilian PAs were requested to call a toll-free
number where a central scheduler/coordinator would
screen the caller on minimum qualifications (i.e., US
citizenship, ages between 17 and 30, AND at least 3
years of general work experience) and provide the
individual with background about the project and the
possible testing dates and arrival time(s). After a PA
had been scheduled for testing, the scheduler/coordinator would contact the testing site manager for the
relevant testing location and notify him/her so that
the testing time slot could be reserved for a PA instead
of an ATCS (for those sites testing PAs and ATCSs
concurrently). The scheduler/coordinator would also
mail a form letter to the newly scheduled PA indicating the agreed upon testing time and date, directions
to the testing facility, and things to bring with them
(i.e., drivers license and birth certificate or passport)
for verification of age and citizenship.

Pseudo-Applicant Sample Composition and Characteristics
Again, the impetus for generating a PA sample was to
administer the AT-SAT predictor battery to a sample
that more closely resembled the likely applicant pool
than would a sample of ATCS job incumbents. The
project team decided to collect data from two different
pools of PAs: one civilian and the other military. The
civilian PA sample was generated using public advertisement and comprised the volunteers obtained from such
advertisement. Because the sample size of the civilian PA
sample was dependent on an unpredictable number of
volunteers, a decision was made to also collect data from
a military PA sample. The military PA sample afforded a known and large sample size and, with the participants' permission, access to their Armed Services Vocational Aptitude Battery (ASVAB) scores. Each of these two pools of PAs is described in the following two subsections.

Military Pseudo-Applicant Sample


Because of the uncertainty about being able to
generate a sufficient PA sample from the civilian
volunteers, it was decided to collect additional data
from a military PA sample. Again, the military PA
sample would afford a known sample size and access
to their ASVAB scores which would prove useful for
validation purposes. For these reasons, the FAA negotiated with the U.S. Air Force to test participants at
Keesler A.F.B., Biloxi, Mississippi. The military PAs
were students and instructors stationed at Keesler
A.F.B. Predictor data were collected from approximately 262 military PAs of which 132 (50.4%) were
currently enrolled in the Air Traffic Control School;
106 (40.5%) were students in other fields such as
Weather Apprentice, Ground Radar Maintenance,
and Operations Resource Management; and 24 (9.2%)
were Air Traffic Control School instructors. Table
5.2.3 provides a breakout of gender and race/ethnicity
by type of sample.

Civilian Pseudo-Applicant Sample

Because the computer equipment with the predictor and criterion software was already set up at each of the 12 CV testing sites, public advertisements were placed locally around the CV testing sites to generate volunteers for the civilian PA sample. The goal for each testing site was to test 40 PAs to help ensure an adequate civilian PA sample size.
Public advertisement for the civilian PA sample was accomplished via several different methods. One method was to place classified advertisements in the largest local, metropolitan newspapers (and some smaller newspapers for those CV sites located away from major metropolitan areas). An example classified newspaper advertisement is shown in Figure 5.2.1. Another means of advertising the testing opportunity was to place flyers at locations in proximity to the testing site. For example, flyers were placed at local vocational technical schools and colleges/universities. An example flyer advertisement is shown in Figure 5.2.2. A third means of advertising the testing to civilian PAs was to publicize the effort via ATCSs to their family, friends, and acquaintances.
When responding to any form of advertisement, potential civilian PAs were requested to call a toll-free number where a central scheduler/coordinator would screen the caller on minimum qualifications (i.e., U.S. citizenship, age between 17 and 30, and at least 3 years of general work experience) and provide the individual with background about the project and the possible testing dates and arrival time(s). After a PA had been scheduled for testing, the scheduler/coordinator would contact the testing site manager for the relevant testing location and notify him/her so that the testing time slot could be reserved for a PA instead of an ATCS (for those sites testing PAs and ATCSs concurrently). The scheduler/coordinator would also mail a form letter to the newly scheduled PA indicating the agreed-upon testing time and date, directions to the testing facility, and things to bring with them (i.e., driver's license and birth certificate or passport) for verification of age and citizenship.

The data in Table 5.2.1 indicate that the civilian and military PA samples were very similar with respect to their gender and race/ethnicity profiles. In addition, both of the PA samples were more diverse than the ATCS sample and fairly similar to the 1990 U.S. Bureau of the Census national breakdown (compare the data of Table 5.2.1 to the data of Table 5.2.2).

Test site administrators provided the PAs with a standardized introduction and set of instructions about the testing procedures to be followed during the computer-administered battery. During the introduction the administrators informed the PAs of the purpose of the study and any risks and benefits associated with participation in the study. The confidentiality of each participant's results was emphasized. In addition, participants
were asked to sign a consent agreement attesting to their
voluntary participation in the study, their understanding
of the purpose of the study, the risks/benefits of participation, and the confidentiality of their results. For the
military PAs, those who signed a Privacy Act Statement
gave their permission to link their predictor test results
with their ASVAB scores.
The testing volunteers were required to sacrifice one
eight-hour day to complete the predictor battery. Although testing volunteers were not compensated for
their time due to project budget constraints, they were
provided with compensation for their lunch.

On-Site Data Collection


Pseudo-applicants were administered the predictor
battery using the same testing procedures as followed for
the ATCS CV sample. The only differences between the
civilian and military PA sample data collection procedures were that:
1. civilians were tested with no more than four other
testing participants at a time (due to the limited
number of computers available at any one of the
testing sites), whereas military PAs at Keesler A.F.B.
were tested in large groups of up to 50 participants
per session.
2. the replacement caps for select keyboard keys
were not compatible with the rental computer keyboards and were unusable. Because of this problem,
index cards were placed adjacent to each of the computer test stations informing the test taker of the
proper keys to use for particular predictor tests. The use
of the index cards instead of the replacement keys did
not appear to cause any confusion for the test takers.

Correction for Range Restriction


As mentioned previously, the reason for collecting
predictor data from PAs was to obtain variance estimates
from individuals more similar to actual applicants for use
in correcting validity coefficients for tests derived from
a restricted sample (i.e., job incumbents). A description
of the results of the range restriction corrections is
contained in Chapter 5.5.
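For orientation, one commonly used univariate correction of this kind (Thorndike's Case 2) is shown below; it is offered only as background, since the correction procedure actually applied is the one described in Chapter 5.5.

r_c = \frac{r\,(S/s)}{\sqrt{1 - r^2 + r^2\,(S/s)^2}}

where r is the validity coefficient observed in the restricted (incumbent) sample, s is the predictor standard deviation in that sample, S is the predictor standard deviation in the applicant-like (PA) group, and r_c is the corrected coefficient.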


CHAPTER 5.3
DEVELOPMENT OF DATABASE

Ani S. DiFazio
HumRRO
The soundness of the validity and fairness analyses
conducted on the beta test data, and of the recommendations based on those results, was predicated on reliable
and complete data. Therefore, database design, implementation, and management were of critical importance
in validating the predictor tests and selecting tests for
inclusion in Version 1 of the Test Battery. The Validation Analysis Plan required many diverse types of data
from a number of different sources. This section describes the procedures used in processing these data and
integrating them into a cohesive and reliable analysis database.

Data Collection Instruments

As described in section 5.1, data from computerized predictor and criterion tests were automatically written as ASCII files by the test software at the test sites. Depending on the test, the data were written either as the examinee was taking the test or upon completion of the test. The data file structure written by each test program was unique to that test. Each file represented an individual test taken by a single examinee. A complete battery of tests consisted of 13 computerized predictor tests as well as one computerized criterion test. For the first AT-SAT data collection (AT-SAT 1), high-fidelity criterion measures were also obtained on a subset of the controller participants.
In addition to the automated test data, several different types of data were collected by hard copy data collection instruments. These include three biographical information forms (for controller participants, pseudo-applicant participants, and assessors), a Request of SSN for Retrieval of the Historical Archival Data form, and a Criterion Assessment Rating Sheet. The Validation Analysis Plan also called for the integration of historical archival data from the FAA.

Initial Data Processing

Automated Test Files
Data Transmittals. The automated test data collected at the 17 test sites were initially sent to HumRRO via Federal Express on a daily basis. This was done so that analysts could monitor test sites closely in the beginning of the test period and solve problems immediately as they arose. Once confident that a test site was following the procedures outlined in the AT-SAT Concurrent Validation Test Administration Manual and was not having difficulty in collecting and transmitting data, it was put on a weekly data transmittal schedule. Out of approximately seven and a half weeks of testing, the typical site followed a daily transmittal schedule for the first two weeks and then sent data on a weekly schedule for the remainder of the testing period. In total, HumRRO received and processed 297 Federal Express packages containing data transmittals from the 17 test sites.
The sites were provided detailed instructions on the materials to be included in a data transmittal packet. First, packets contained a diskette of automated test files for each day of testing.2 Sites were asked to include a Daily Activity Log (DAL) if any problems or situations arose that might affect examinee test performance. Along with each diskette, the sites were required to submit a Data Transmittal Form (DTF)3 which provided an inventory of the pieces of data contained in the transmittal packet. During the testing period, HumRRO received and processed 622 hard copy DTFs.
Data Processing Strategy. Because of the magnitude of data and the very limited time allocated for its processing, a detailed data processing plan was essential. The three main objectives in developing a strategy for processing the automated test data from the test sites were to:
Ensure that the test sites were transmitting all the data they were collecting and that no data were inadvertently falling through the cracks in the field.
Closely monitor the writing and transmittal of data by the sites, so that problems would be quickly addressed before large amounts of data were affected.
Identify and resolve problematic or anomalous files.

2 Some sites wrote the transmittal diskette at the end of the test day, while others cut the data at the end of a shift. In these cases, more than one diskette would be produced for each test day.
3 While a DTF was supposed to be produced for each diskette transmitted, some sites sent one DTF covering a number of test days, and, conversely, more than one DTF describing a single diskette.

To accomplish these objectives, the test data were initially passed through two stages of data processing as testing was in progress. A third processing stage, described in the later subsection Integration of AT-SAT Data, occurred after testing was completed and served to integrate the diverse data collected for this effort into a reliable and cohesive database.
During the testing period, up to four work stations were dedicated to processing data transmittal packets sent by the sites. One work station was reserved almost exclusively for preliminary processing of the packets. This stage one processing involved unpacking Federal Express transmittals, identifying obvious problems, date stamping and transcribing the DTF number on all hard copy data collection forms, summarizing AT-SAT 1 examinee demographic information for weekly reports, and ensuring that the data were passed on to the next stage of data processing.
The stage two data processors were responsible for the initial computer processing of the test data. Their work began by running a Master Login procedure that copied the contents of each diskette transmitted by the test sites onto the work station's hard drive. This procedure produced a hard copy list of the contents of the diskette and provided a baseline record of all the data received from the sites.4 Next, using a key entry screen developed solely for this application, information on participant data from each DTF was automated and Statistical Analysis System (SAS) DTF files were created.5 This stage two automation of DTF hard copy forms served both record keeping and quality assurance functions. To gauge whether the sites were transmitting all the data they collected, the inventory of participant predictor and CBPM test data listed on the DTF was compared electronically to the files contained on the diskette being processed.6 Whenever there was a discrepancy, the data processing software developed for this application automatically printed a report listing the names of the discrepant files. Discrepancies involving both fewer and more files recorded on the diskettes than expected from the DTF were reported. Test site managers/administrators were then contacted by the data processors to resolve the discrepancies. This procedure identified files that test sites inadvertently omitted in the data transmittal package.7
As helpful as this procedure was in catching data that may have been overlooked at sites, it was able to identify missing files only if the DTF indicated that they should not be missing. The procedure would not catch files that were never listed on the DTF. It was clear that this sort of error of omission was more likely to occur when large amounts of data were being collected at sites. While the second AT-SAT data collection (AT-SAT 2) tested just over 300 participants, AT-SAT 1 included over four and a half times that number. Therefore, if this type of error of omission was going to occur, it would likely occur during the first AT-SAT data collection rather than the second. To avoid this error, the AT-SAT 1 test site managers needed to assess the completeness of the data sent for processing against other records maintained at

4 The Master Login software did not copy certain files, such as those with zero bytes.
5 In automating the DTF, we wanted one DTF record for each diskette transmitted. Because sites sometimes included the information from more than one diskette on a hard copy DTF, more than one automated record was created for those DTFs. Conversely, if more than one hard copy DTF was transmitted for a single diskette, they were combined to form one automated DTF record.
6 This computerized comparison was made between the automated DTF and an ASCII capture of the DOS directory of the diskette from the test site. The units of analysis in these two datasets were originally different. Since a record in the directory capture data was a file (i.e., an examinee/test combination), there was more than one record per examinee. An observation in the original DTF file was an examinee, with variables indicating the presence (or absence) of specific tests. In addition, the DTF inventoried predictor tests in four testing blocks rather than as individual tests. Examinee/test-level data were generated from the DTF by producing dummy electronic DTF records for each predictor test that was included in a test block that the examinee took. Dummy CBPM DTF records were also generated in this manner. By this procedure, the unit of analysis in the automated DTF and DOS directory datasets was made identical and a one-to-one computerized comparison could be made between the DTF and the data actually received.
7 Conversely, this procedure was also used to identify and resolve with the sites those files that appeared on the diskette, but not on the DTF.


the site, such as the Individual Control Forms. Approximately three quarters into the AT-SAT 1 testing period,
the data processors developed a table for each site that
listed examinees by the types of data8 that had been
received for them. A sample of this table and the cover
letter to test site managers is provided in Appendix I. The
site managers were asked to compare the information on
this table to their Individual Control Forms and any
other records maintained at the site. The timing of this
exercise was important because, while we wanted to
include as many examinees as possible, the test sites still
had to be operational and able to resolve any discrepancies discovered. The result of this diagnostic exercise was
very encouraging. The only type of discrepancy uncovered was in cases where the site had just sent data that had
not yet been processed. Because no real errors of omission were detected and since AT-SAT 2 involved fewer
cases than AT-SAT 1, this diagnostic exercise was not
undertaken for AT-SAT 2.
Further quality assurance measures were taken to
identify and resolve any systematic problems in data
collection and transmission. Under the premise that
correctly functioning test software would produce files
that fall within a certain byte size range and that malfunctioning software would not, a diagnostic program was
developed to identify files that were too small or too big,
based on normal ranges for each test. The objective was
to avoid pervasive problems in the way that the test
software wrote the data by reviewing files with suspicious
byte sizes as they were received. To accomplish this, files
with anomalous byte sizes and the pertinent DALs were
passed on to a research analyst for review. A few problems
were identified in this way. Most notably, we discovered
that the software in the Scan predictor test stopped
writing data when the examinee did not respond to test
items. Also, under some conditions, the Air Traffic
Scenarios test software did not write data as expected;
investigation indicated that the condition was rare and
that the improperly written data could, in fact, be read
and used, so the software was not revised. No other
systematic problems in the way the test software wrote
data were identified.
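A minimal sketch of this kind of byte-size screen is given below; the per-test ranges and file-naming convention shown are hypothetical stand-ins, since the actual values used by the project are not reported in the text.

import os

# Hypothetical per-test normal byte-size ranges (min_bytes, max_bytes);
# the real ranges were derived empirically for each AT-SAT test.
NORMAL_RANGES = {
    "AM": (2_000, 20_000),    # Applied Math
    "SC": (5_000, 80_000),    # Scan
    "AT": (10_000, 200_000),  # Air Traffic Scenarios
}

def flag_suspicious_files(directory: str) -> list[str]:
    """Return files whose size falls outside the normal range for their test.

    Assumes file names start with the two-letter test acronym, e.g. AM12345.DAT.
    """
    flagged = []
    for name in os.listdir(directory):
        test = name[:2].upper()
        if test not in NORMAL_RANGES:
            continue
        size = os.path.getsize(os.path.join(directory, name))
        low, high = NORMAL_RANGES[test]
        if not (low <= size <= high):
            flagged.append(name)
    return flagged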
This procedure was also one way to identify files with problems of a more idiosyncratic nature. The identification of file problems by the data processors was typically based on improper file name and size attributes. In some cases, the sites themselves called attention to problems
with files whose attributes were otherwise normal. In
most cases, the problem described by the site involved
the use of an incorrect identification number for an
examinee in the test start-up software. A number of other
situations at the test sites led to problematic files, such as
when a test administrator renamed or copied a file when
trying to save an examinee's test data in the event of a
system crash. Very small files or files containing zero
bytes would sometimes be written when an administrator logged a participant onto a test session and the
examinee never showed up for the test. In the first few
weeks of testing, a number of files used by test site
managers to train administrators had then been erroneously transmitted to the data processors. It is important
to note that the contents of the test files were not
scrutinized at this stage of processing.
The stage two processors recorded each problem
encountered in a Problem Log developed for this purpose. The test site manager or administrator was then
contacted and the test site and data processor worked
together to identify the source of the problem. This
approach was very important because neglected systematic data collection and transmittal issues could have had
far-reaching negative consequences. Resolution of the
problem typically meant that the test site would retransmit the data, the file name would be changed
according to specific manager/administrator instructions, or the file would be excluded from further processing. For each problem identified, stage two data processors
reached a resolution with the test sites, and recorded that
resolution in the processor's Problem Log.
Once all of these checks were made, data from the test
sites were copied onto a ZIP9 disk. Weekly directories on
each ZIP disk contained the test files processed during a
given week for each stage two work station. The data in
the weekly directories were then passed on for stage
three processing. To ensure that only non-problematic
files were retained on the ZIP disks and that none were
inadvertently omitted from further processing, a weekly
reconciliation was performed that compared all the test
files processed during the week (i.e., those copied to the
work station's hard drive by the Master Login procedure) to the files written on the week's ZIP disk. A computer

8 This table reported whether predictor and CBPM test data, participant biographical information forms, and SSN Request Forms had been received.
9 ZIP disks are a virtually incorruptible data storage medium that hold up to 100 megabytes of data.


application was written that automatically generated the names of all the discrepant files between these two sources.
Every week, each stage two data processor met with
the database manager to discuss these discrepancies. The
data processor had to provide either a rationale for the
discrepancy or a resolution. The most typical rationale
was that the data processor was holding out a file or
waiting for the re-issuance of a problem file from the test
site. Meticulous records were kept of these hold-out
files and all were accounted for before the testing periods
were completed. Resolutions of discrepancies typically
included deletion or addition of files or changes to file
names. In these cases, the database manager handled
resolutions and the reconciliation program was reexecuted to ensure accuracy. These procedures resulted in a total of 23,107 files10 written onto ZIP disk
at the conclusion of stage two processing for AT-SAT
1 and 2 combined.
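The weekly reconciliation amounts to a two-way set comparison between the files copied to the work station and the files written to the ZIP disk. The following sketch illustrates the idea under that assumption; the paths and function name are invented for the example.

import os

def reconcile(hard_drive_dir: str, zip_disk_dir: str) -> tuple[set, set]:
    """Compare the files processed during the week against the weekly ZIP disk.

    Returns (missing_from_zip, unexpected_on_zip); both sets should be empty,
    or every discrepancy should have a documented hold-out rationale.
    """
    processed = set(os.listdir(hard_drive_dir))
    archived = set(os.listdir(zip_disk_dir))
    return processed - archived, archived - processed

# Example use (paths are illustrative):
# missing, unexpected = reconcile("C:/atsat/week07", "E:/week07")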
So as not to waste analysis time during AT-SAT 1, raw
CBPM test files contained on weekly ZIP disks were sent
to PDRI on a weekly basis during the testing period,
along with the DALs and lists of files with size problems.
During AT-SAT 2, CBPM files were sent to PDRI at the
end of the testing period; DALs and DTFs were sent to
PDRI directly from the sites. Similarly, Analogies (AN),
Planes (PL), Letter Factory (LA), and Scan (SC) raw test
files were sent to RGI on a weekly basis during AT-SAT
1 and at the end of the testing period for AT-SAT 2. At
the end of the AT-SAT 1 testing period, all the collected
data for each of these tests were re-transmitted to the
appropriate organization, so that the completeness of the
cumulative weekly transmittals could be assessed against
the final complete transmittal.
HumRRO wrote computer applications that read the
raw files for a number of predictor tests. These tests,
which contained multiple records per examinee, were
reconfigured into ASCII files with a single record for
each participant for each test. SAS files were then created
for each test from these reconfigured files. This work was
performed for the following tests: Applied Math (AM),
Dials (DI), Memory 1 (ME), Memory 2 (MR), Sound
(SN), Angles (AN), Air Traffic Scenarios (AT), Time
Wall (TW), and the Experience Questionnaire (EQ). At
the conclusion of testing, the reconfigured EQ data were
sent to PDRI for scoring and analysis.
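The reconfiguration described here is essentially a reshape from multiple records per examinee to one wide record per examinee per test. A rough pandas sketch, with invented column names, is:

import pandas as pd

def reshape_test_file(records: pd.DataFrame) -> pd.DataFrame:
    """Collapse multiple records per examinee into a single wide record.

    Assumes columns 'examinee_id', 'item', and 'response'; the real AT-SAT raw
    files had test-specific layouts, so this shows only the general idea.
    """
    return records.pivot(index="examinee_id", columns="item", values="response")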

Hard Copy Data


Data Handling of Participant Biographical Data
and Request for SSN Forms. As mentioned above, stage
one processors handled the data transmittal packages
from the test sites. Once each hard copy form had been
date stamped, these processors passed the participant
biographical forms and SSN Request Forms to stage two
processors. Here, as in the processing of automated test
data, to ensure that all the data indicated on the DTF had
been sent, a report printed by the DTF automation
program listed all the hard copy participant forms that
the DTF indicated should be present for an examinee.
The stage two data processors were then required to find
the hard copy form and place a check mark in the space
provided by the reporting program. As with the automated test data, all problems were recorded in the data
processors' Problem Log and the test sites were contacted
for problem resolution.
Data Handling of Assessor Biographical Data and
Criterion Assessment Rating Sheets: As discussed earlier, the automated DTF file contained information
recorded on the first page of the DTF form describing
the participant data transmitted from the site. The
second page of the hard copy DTF contained information on assessor data, specifically whether a Confidential Envelope, which contained the Criterion Rating
Assessment Sheet(s) (CARS), and an Assessor Biographical Form were present in the data transmittal package.
HumRRO handled assessor biographical data and the
Criterion Rating Assessment Sheets during AT-SAT 1;
these hard copy instruments were processed by PDRI
during AT-SAT 2. As with other types of data, to ensure
that all collected assessor information was actually transmitted, stage one processors compared the assessor data
contained in each data transmittal package to the information contained on the DTF. Test sites were informed
of all discrepancies by e-mailed memoranda or telephone
communication and were asked to provide a resolution
for each discrepancy. Because the assessors were often
asked to provide CARS ratings and complete the Assessor Biographical Data Form at the same time, they often
included the biographical form in the Confidential
Envelope along with the CARS. As a consequence, the
test site administrator did not have first-hand knowledge
of which forms were contained in the envelopes. In
processing the hard copy assessor data, there were a total

10 The 23,107 files comprised the CBPM test, the 13 predictor tests, and one start-up (ST) file for each controller examinee, and the 13 predictor tests and one start-up (ST) file for each pseudo-applicant.


of 29 assessor discrepancies (see footnote 11) between the data actually received and the data the DTF indicated should have been received. Of these 29, only four discrepancies could
not be resolved. In these instances the assessor simply
may not have included in the Confidential Envelope the
forms that the administrator thought were included.
Data Automation. Hard copy forms that passed
through to stage two processing were photocopied and
the originals filed awaiting automation. Since there were
no other copies of these data, photocopies insured against
their irrevocable loss, particularly once they were sent to
key-punch. All original and photocopied Request for
SSN Forms were stored in a locked cabinet. Five separate
ASCII key entry specifications were developed by the
AT-SAT database manager: for the three biographical
data instruments, the CARS form, and the Request for
SSN Form. The database manager worked closely with
the data automation company chosen to key enter the
data. The data were double-keyed to ensure accuracy.
Once the data were keyed and returned, the total number
of cases key entered were verified against the total number of hard copy forms sent to key-punch. Data were sent
to key-punch in three installments during the course of
AT-SAT 1 testing; a small fourth installment comprised
of last minute stragglers was keyed in-house. CAR and
assessor biographical AT-SAT 2 data were sent to keypunch in two installments during testing and a small
third installment of stragglers was keyed in-house by
PDRI. In AT-SAT 1, automated files containing assessor and participant biographical data and criterion ratings data were sent to PDRI a few times during the course
of testing; complete datasets were transmitted when
testing was concluded.

Historical Data
Confidentiality of test participants was a primary concern in developing a strategy for obtaining historical data from the FAA computer archives and linking those data to other AT-SAT datasets. Specifically, the objective was to ensure that the link between test examinees and controllers was not revealed to the FAA, so that test results could never be associated with a particular employee. Also, although the FAA needed participant controller Social Security Numbers (SSN) to identify and extract cases from their historical archives, these SSNs could not be returned once the historical information had been extracted. Therefore, examinee number or SSN could not be used as the link between records in the historical data and the other AT-SAT data collected. To overcome this problem, a unique random identification number was generated for each controller examinee who submitted a Request for SSN form in AT-SAT 1 and 2. Electronic files containing the SSN, this random identification number, and site number were sent to the FAA. Of the 986 controllers who submitted a Request for SSN Form, 967 had non-missing SSNs that could be linked to the FAA archival data. In addition to these 967 SSNs, the FAA received 4 SSN Forms during the high fidelity testing in Oklahoma City, which increased the number of cases with historical data to 971.
Pseudo-Applicant ASVAB Data
AFQT scores and composite measures of ASVAB subtests G (General), A (Administrative), M (Mechanical), and E (Electronic) were obtained for Keesler pseudo-applicants and merged with test and biographical data during stage three data processing.
Integration of AT-SAT Data
The goal in designing the final AT-SAT database was to create a main dataset that could be used to address most analytic needs, with satellite datasets providing more detailed information in specific areas. Before the database could be created, data processors needed to perform diagnostic assessments of the accuracy of the data and edit the data on the basis of those assessments. Stage three data processing activities included these diagnostic data checks and edits, as well as data merging and archive.
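As an illustration of the confidentiality scheme described under Historical Data above, the linkage can be sketched as follows; the field names and ID format are assumptions made for the example, not the project's actual specifications.

import secrets

def build_linkage_file(requests):
    """Given records with an SSN and site number, attach a unique random ID.

    Only (ssn, random_id, site) would be sent to the FAA; the AT-SAT side keeps
    (examinee_number, random_id) so archival data can be merged back by random_id
    without the FAA ever seeing which examinee number belongs to which SSN.
    """
    used = set()
    linkage = []
    for rec in requests:
        rid = secrets.token_hex(4)
        while rid in used:          # guarantee uniqueness
            rid = secrets.token_hex(4)
        used.add(rid)
        linkage.append({"ssn": rec["ssn"], "random_id": rid, "site": rec["site"]})
    return linkage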

Data Diagnostics and Edits


Since the data contained on the test files were written
by test software that was generally performing as expected, there were no errors in data recordation, and
therefore no need for large-scale data editing. There were
two types of diagnostic checks to which the test files were
subjected, however. First, a check was made to see
whether an examinee had taken the same test more than
once. It is a testament to the diligent work of the test sites
and the data processors that this anomaly was not evident

11 The total number of assessor discrepancies e-mailed to sites was 41. For 12 participant assessors, the test administrator indicated the presence of an assessor biographical form on the DTF when a participant biographical form had actually been completed. Therefore, the number of true assessor discrepancies was 29.


in the data. Second, the test analysts performed diagnostics to identify observations that might be excluded from
further analysis, such as those examinees exhibiting
motivational problems. Obviously, historical data from
the FAA archives were not edited. Data collected on hard
copy instruments were subjected to numerous internal
and external diagnostic and consistency checks and
programmatic data editing. A primary goal in data
editing was to salvage as much of the data as possible
without jeopardizing accuracy.
Participant Biographical Data. Several different types of problems were encountered with the participant biographical data:
More than one biographical information form completed by the same participant
Missing or out-of-range examinee identification number
Out-of-range date values
First, to correct the problem of duplicate12 biographical forms for the same examinee, all forms completed after the first were deleted. Second, information from the DTF sent with the biographical form often made it possible to identify missing examinee numbers through a process of elimination. Investigation of some out-of-range examinee numbers revealed that the digits had been transposed at the test site. Third, out-of-range date values were either edited to the known correct value or set to missing when the correct value was unknown.
Other data edits were performed on the controller and pseudo-applicant participant biographical data. A number of examinees addressed the question of racial/ethnic background by responding Other and provided open-ended information in the space allowed. In many cases, the group affiliation specified in the open-ended response could be re-coded to one of the five specific alternatives provided by the item (i.e., Native American/Alaskan Native, Asian/Pacific Islander, African American, Hispanic, or Non-Minority). In these cases, the open-ended responses were recoded to one of the close-ended item alternatives. In other cases, a sixth racial category, mixed race, was created and applicable open-ended responses were coded as such.
Two types of edits were applicable only to the controller sample. First, in biographical items that dealt with the length of time (months and years) that the controller had been performing various duties, when only the month or year component was missing, the missing item was coded as zero. Also, for consistency, full years were always recorded in the year field rather than the month field (e.g., as 2 years rather than 24 months). When years were reported in the month field, the year field was incremented by the appropriate amount and the month field re-coded to reflect any remaining time of less than a year.
Second, a suspiciously large group of controller participants reported their race as Native American/Alaskan Native on the biographical form. To check the accuracy of self-reported race, the responses were compared to the race/ethnic variable on the historical FAA archive data. For those controllers with historical data, racial affiliation from the FAA archives was used rather than self-reported race as a final indication of controller race. The following frequencies of race from these two sources of information show some of the discrepancies (Source 1 represents self-reported race from the biographical form only, and Source 2 represents race based on archival race when available and self-reported race when it was not). Using Source 1, there were 77 Native American/Alaskan, compared to 23 using Source 2. Similarly, there were 9 and 7 Asian/Pacific Islander, respectively (Source 1 is always given first), 95 and 98 African American, 64 and 61 Hispanic, 804 and 890 Non-Minority, 20 and 8 Other, and 4 and 1 Mixed Race. This gives a total of 1,073 participants by Source 1 and 1,088 by Source 2, with 159 cases missing Source 1 data and 144 missing Source 2 data. (Counts for Other were produced after Other was re-coded into one of the five close-ended specified item alternatives whenever possible.)
All edits were performed programmatically, with hard copy documentation supporting each edit maintained in a separate log. In 33 cases, participant assessors completed only assessor rather than participant biographical forms. In these cases, biographical information from the assessor form was used for participants.
Assessor Biographical Data. Like the participant data, the assessor biographical data required substantial data cleaning. The problems encountered were as follows:

More than one biographical information form completed by the same assessor
Incorrect assessor identification numbers
Out-of-range date values

12 The word duplicate here does not necessarily mean identical, but simply that more than one form was completed by a single participant. More often than not, the duplicate forms completed by the same participant were not identical.


First, the same rule formulated for participants, deleting all duplicate biographical records completed after the
first, was applied. Second, by consulting the site Master
Rosters and other materials, misassigned or miskeyed13
rater identification numbers could be corrected. Third,
out-of-range date values were either edited to the known
correct value (i.e., the year that all biographical forms
were completed was 1997) or set to missing when the
correct value was unknown.
In addition to data corrections, the race and time
fields in the assessor data were edited following the
procedures established in the participant biographical
data. Open-ended responses to the racial/ethnic background item were re-coded to a close-ended alternative
whenever possible. In addition, when only the month or
year component in the time fields was missing, the
missing item was coded as zero. When full years were
reported in the month field (e.g., 24 months), the year
field was incremented by the appropriate amount and
the month field re-coded to reflect any remaining time
less than a year.
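The month/year re-coding rule used for both the participant and assessor time fields reduces to simple integer arithmetic; a minimal sketch (with invented names) is:

def recode_time_fields(years, months):
    """Re-code a (years, months) pair so full years never sit in the month field.

    Missing components (None) are coded as zero, and any whole years reported
    in the month field (e.g., 24 months) are moved into the year field.
    """
    years = 0 if years is None else years
    months = 0 if months is None else months
    extra_years, remaining_months = divmod(months, 12)
    return years + extra_years, remaining_months

print(recode_time_fields(None, 24))  # (2, 0)
print(recode_time_fields(3, 14))     # (4, 2)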
Since the test sites were instructed to give participants
who were also assessors a participant, rather than assessor, biographical form, data processors also looked for
biographical information on raters among the participant data. Specifically, if an assessor who provided a
CARS for at least one participant did not have an assessor
biographical form, participant biographical data for that
assessor were used, when available.
Criterion Ratings Data. Of all the hard copy data
collected, the CARS data required the most extensive
data checking and editing. Numerous consistency checks
were performed within the CARS dataset itself (e.g.,
duplicate rater/ratee combinations), as well as checks of its consistency with other datasets (e.g., assessor biographical data). All edits were performed programmatically, with hard copy documentation supporting each edit maintained in a separate log. The following types of problems were encountered:
Missing or incorrect examinee/rater numbers
Missing rater/ratee relationship
Duplicate rater/ratee combinations
Rater/ratee pairs with missing or outlier ratings or involved in severe DAL entries
Out-of-range date values

First, the vast majority of missing or incorrect identification numbers and/or rater/ratee relationships were
corrected by referring back to the hard copy source and/
or other records. In some cases the test site manager was
contacted for assistance. Since the goal was to salvage as
much data as possible, examinee/rater numbers were
filled in or corrected whenever possible by using records
maintained at the sites, such as the Master Roster.
Problems with identification numbers often originated
in the field, although some key-punch errors occurred
despite the double-key procedure. Since examinee number on a CARS record was essential for analytic purposes,
six cases were deleted where examinee number was still
unknown after all avenues of information had been
exhausted.
Second, some raters provided ratings for the same
examinee more than once, producing records with duplicate rater/ratee combinations. In these cases, hard copy
sources were reviewed to determine which rating sheet
the rater had completed first; all ratings produced
subsequently for that particular rater/ratee combination were deleted.
Third, some cases were deleted based on specific
direction from data analysts once the data had been
scrutinized. These included rater/ratee combinations
with more than 3 of the 11 rating dimensions missing,
outlier ratings, ratings dropped due to information in the
Problem Logs, or incorrect assignment of raters to ratees
(e.g., raters who had not observed ratees controlling
traffic). Fourth, CARS items that dealt with the length of
time (months and years) that the rater had worked with
the ratee were edited, so that when only the month or
year component was missing, the missing item was
coded as zero. Where full years were reported in the
month field, the year field was incremented and the
month field re-coded to reflect any remaining time.
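Two of the CARS editing rules described above (keeping only the first rating per rater/ratee pair, and dropping pairs with more than 3 of the 11 dimensions missing) can be illustrated with the pandas sketch below; the column names are assumptions rather than the actual variable names in the CARS dataset.

import pandas as pd

RATING_COLS = [f"dim_{i}" for i in range(1, 12)]  # 11 rating dimensions (assumed names)

def clean_cars(cars: pd.DataFrame) -> pd.DataFrame:
    """Apply two of the CARS editing rules described in the text."""
    # Keep only the first rating sheet completed for each rater/ratee pair
    # (assumes the frame is already sorted in completion order).
    cars = cars.drop_duplicates(subset=["rater_id", "ratee_id"], keep="first")
    # Drop rater/ratee pairs with more than 3 of the 11 dimensions missing.
    missing_counts = cars[RATING_COLS].isna().sum(axis=1)
    return cars[missing_counts <= 3]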
AT-SAT Database
As stated above, the database management plan called
for a main AT-SAT dataset that could address most
analytic needs, with satellite datasets that could provide
detailed information in specific areas. The AT-SAT
Database, containing data from the alpha and beta tests,
is presented in Figure 5.3.1. To avoid redundancy,
datasets that are completely contained within other
datasets are not presented separately in the AT-SAT


13 The miskeying was often the result of illegible handwriting on the hard copy forms.


Database. For example, since participant biographical data is completely contained in the final summary dataset, it is not provided as a separate satellite dataset in the AT-SAT Database. Similarly, since the rater biographical
data contains all the data recorded on the assessor biographical form, as well as some participant forms, the
assessor biographical form is not listed as a separate
dataset in the AT-SAT Database. All data processing for
the AT-SAT Database was done in the Statistical Analysis System (SAS). The datasets contained in the archived
AT-SAT Database were stored as portable Statistical
Package for the Social Sciences (SPSS) files.
Alpha Data. The Alpha data consist of a summary
dataset as well as scored item level test data from the
Pensacola study conducted in the spring of 1997. Scored
test data and biographical information are stored in the
summary dataset called SUMMARY.POR. Item level
scored test data are contained in 14 individual files
named xx_ITEMS.POR, where xx is the predictor test
acronym; an additional 15th file called AS_ITEMS.POR
contains ASVAB test scores.
Beta Test Data. The Final Analytic Summary Data file in the AT-SAT database is comprised of a number of different types of data:
Subset of scored test variables
Complete historical FAA archive data
Participant biographical information
ASVAB data for Keesler participants
Information on rater identification numbers
As stated previously, HumRRO, RGI, and PDRI were each responsible for developing and analyzing specific tests in the beta test battery. The results of these analyses are presented in detail elsewhere in this report. Once the tests had been scored, each organization returned the scored item-level data to the AT-SAT database manager. Salient scored variables were extracted from each of these files and were linked together by examinee number. This created an examinee-level dataset with a single record containing test information for each examinee. Participant biographical data and historical FAA archive data were merged to this record, also by examinee number. For Keesler pseudo-applicants, ASVAB data were added. Participants for whom at least one CARS had been completed also had variable(s) appended to their main record containing the identification number of their assessor(s), so that examinee-level and assessor-level data can be easily linked. Test variable names always begin with the two-letter test acronym; the names of biographical items in this data file begin with BI.
This main analysis dataset is called XFINDAT5.POR and contains 1,752 cases with 1,466 variables.14
The satellite test and rating data in the AT-SAT Database are comprised of three types of files. The first group consists of the 23,107 raw ASCII examinee test (predictor and CBPM) files stored in weekly data processing directories. The processing of these data is described in the subsection Initial Data Processing, Automated Test Files. These raw files are included in the AT-SAT Database primarily for archival purposes. Second, there is the electronic edited version of the CARS hard copy data, called CAR.POR, which is described in the subsection Initial Data Processing, Hard Copy Data. This file is also included in the AT-SAT Database mainly for purposes of data archive. The third group of files contains complete scored item-level test data for examinees, derived from the first two types of data files listed above. The predictor scored item-level files (e.g., EQ_ITEM.POR, AM_ITEMS.POR) were derived from the raw ASCII predictor test files; the criterion file (CR_ITEMS.POR) was derived from raw CBPM test files and the CAR data.15 Salient variables from these scored item-level test files constitute the test data in the analytic summary file XFINDAT5.POR.
Biographical Data were also included in the beta test datasets. Complete examinee biographical data are contained in the analytic summary file XFINDAT5.POR and are, therefore, not provided as a separate file in the database. Biographical information on assessors only and participant assessors is contained in the dataset called XBRATER.POR and is described in the subsection Initial Data Processing, Hard Copy Data.
Data Archive. The AT-SAT database described above is archived on CD-ROM. Figure 5.3.2 outlines the directory structure for the AT-SAT CD-ROM data archive. The root directory contains a README.TXT
14 The following FAA-applied alphanumeric variables were assigned an SPSS system missing value when the original value consisted of a blank string: CFAC, FAC, FORM, IOPT, OPT, ROPT, STATSPEC, TTYPE, and @DATE. The following FAA-supplied variables were dropped since they contained missing values for all cases: REG, DATECLRD, EOD, FAIL16PF, P_P, and YR.
15 This file also contains scored High Fidelity test data.


file that provides a brief description of the data archive; it also contains two subdirectories. The first subdirectory contains Alpha data, while the second contains data for the Beta analysis. Within the Alpha subdirectory, there are two subdirectories, Final Summary Data and Examinee Item Level Scored Data, each of which contains data files. The Beta subdirectory contains the following subdirectories:
Edited Criterion Assessment Rating Sheets
Edited Rater Biodata Forms
Examinee Item Level Scored Test Data
Final Analytic Summary Data
Raw Examinee Test Data in Weekly Subdirectories
Scaled, Imputed, and Standardized Test Scores
Each Beta subdirectory contains data files. In addition, the Final Analytic Summary Data subdirectory contains a codebook for XFINDAT5.POR. The codebook consists of two volumes that are stored as Microsoft Word files CBK1.DOC and CBK2.DOC. The CBK1.DOC file contains variable information generated from an SPSS SYSFILE INFO. It also contains a Table of Contents to the SYSFILE INFO for ease of reference. The CBK2.DOC file contains frequency distributions for discrete variables, means for continuous data elements, and a Table of Contents to these descriptive statistics.16

16 Means were generated on numeric FAA-generated historical variables unless they were clearly discrete.


CHAPTER 5.4
BIOGRAPHICAL AND COMPUTER EXPERIENCE INFORMATION:
DEMOGRAPHICS FOR THE VALIDATION STUDY
Patricia A. Keenan, HumRRO

This chapter presents, first, the demographic characteristics of the participants in both the concurrent validation and the pseudo-applicant samples. The data on the controller sample are then presented, followed by the pseudo-applicant information; the latter data are divided between civilian and military participants. It should be noted that not all participants answered each question in the biographical information form, so at times the numbers will vary or cumulative counts may not total 100%.

TOTAL SAMPLE

Participant Demographics
A total of 1,752 individuals took part in the study (incumbents and pseudo-applicants); 1,265 of the participants were male (72.2%) and 342 were female (19.5%). 145 participants did not indicate their gender; 149 did not identify their ethnicity. The cross-tabulation of ethnicity and gender, presented in Table 5.4.1, represents only those individuals who provided complete information about both their race and gender.
The sample included incumbent FAA controllers, supervisors, and staff (Controller sample) as well as pseudo-applicants from Keesler Air Force Base (Military PA sample) and civilian volunteers from across the country (Civilian PA sample). The pseudo-applicants were selected based on demographic similarity to expected applicants to the controller position. The estimated average age of the total sample was 33.14 years (SD = 8.43). Ages ranged from 18 to 60 years. This number was calculated based on the information from 1,583 participants; 169 people did not provide information about their date of birth and were not included in this average.
Participants were asked to identify the highest level of education they had received. Table 5.4.2 presents a breakdown of the educational experience for all participants. (151 people did not provide information about their educational background.) The data were collected at 18 locations around the U.S. Table 5.4.3 shows the number of participants who tested at each facility.

CONTROLLER SAMPLE

Participant Demographics
A total of 1,232 FAA air traffic controllers took part in the concurrent validation study. 912 controllers were male (83.7%) and 177 controllers were female (16.3%); 143 participants did not specify their gender, so their participation is not reflected in gender-based analyses. The majority of the data was collected in 1997. A supplementary data collection was conducted in 1998 to increase the minority representation in the sample. A total of 1,081 controllers participated in the 1997 data collection; 151 additional controllers participated in 1998. Table 5.4.4 shows the cross-tabulation of race and gender distribution for the 1997 and 1998 samples, as well as the combined numbers across both years. 143 individuals did not report their gender and 144 did not report their race. These individuals are not reflected in Table 5.4.4. The average age of the controllers was 37.47 (SD = 5.98), with ages ranging from 25 to 60 years. The mean was based on information provided by 1,079 of the participants; age could not be calculated for 153 participants.
Also of interest was the educational background of the controllers. Table 5.4.5 shows the highest level of education achieved by the respondents. No information on education was provided by 145 controllers.

Professional Experience
The controllers represented 17 enroute facilities. The
locations of the facilities and the number of controller
participants at each one are shown in Table 5.4.6. A total
of 1,218 controllers identified the facility at which they
are assigned; 14 did not identify their facility.
One goal of the study was to have a sample composed
of a large majority of individuals with air traffic experience, as opposed to supervisors or staff personnel. For
this reason, participants were asked to identify both their
current and previous positions. This would allow us to
identify everyone who had current or previous experience in air traffic control. Table 5.4.7 indicates the
average number of years the incumbents in each job


category had been in their current position. 142 controllers did not indicate their current position. The air traffic controller participant sample included journeyman controllers, developmental controllers, staff and supervisors, as well as individuals holding several other positions. These other positions included jobs described as Traffic Management Coordinator.
Overall, the participants indicated they had spent an average of 4.15 years in their previous position. These positions included time as journeyman controller, developmental controller, staff, supervisor, or other position. Those responding Other included cooperative education students, Academy instructors, and former Air Force air traffic controllers.
One goal of the biographical information form was to get a clear picture of the range and length of experience of the participants in the study. To this end, they were asked the number of years and months as FPL, staff, or supervisor in their current facility and in any facility. The results are summarized in Table 5.4.8. Few of the respondents had been in a staff or supervisory capacity for more than a few months. Half of the respondents had never acted in a staff position and almost two-thirds had never held a supervisory position. The amount of staff experience ranged from 0 to 10 years, with 97.6% of the participants having less than four years of experience. The findings are similar for supervisory positions; 99% of the respondents had seven or fewer years of experience. This indicates that our controller sample was indeed largely composed of individuals with current or previous controller experience.
Also of interest was the amount of time the incumbents (both controllers and supervisors) spent actually controlling air traffic. Respondents were asked how they had spent their work time over the past six months and then to indicate the percentage of their work time they spent controlling traffic (i.e., plugged-in time) and the percentage they spent in other job-related activities (e.g., crew briefings, CIC duties, staff work, supervisory duties). The respondents indicated that they spent an average of 72.41% of their time controlling traffic and 23.33% of their time on other activities.

PSEUDO-APPLICANT SAMPLE
A total of 518 individuals served as pseudo-applicants in the validation study; 258 individuals from Keesler Air Force Base and 256 civilians took part in the study. The racial and gender breakdown of these samples is shown in Table 5.4.9.

COMPUTER USE AND EXPERIENCE QUESTIONNAIRE

To determine if individual familiarity with computers could influence their scores on several of the tests in the predictor battery, a measure of computer familiarity and skill was included as part of the background items. The Computer Use and Experience (CUE) Scale, developed by Potosky and Bobko (1997), consists of 12 five-point Likert-type items (1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree nor Disagree, 4 = Agree, 5 = Strongly Agree), which asked participants to rate their knowledge of various uses for computers and the extent to which they used computers for various reasons. In addition, 5 more items were written to ask participants about actual use of the computer for such purposes as playing games, word processing, and using e-mail. The resulting 17-item instrument is referred to in this report as the CUE-Plus.
Item Statistics
The means and standard deviations for each item are presented in Table 5.4.10. The information reported in the table includes both the Air Traffic Controller participants and the pseudo-applicants. Overall, the respondents show familiarity with computers and use them to different degrees. Given the age range of our sample, this is to be expected. As might be expected, they are fairly familiar with the day-to-day uses of computers, such as doing word processing or sending e-mail. Table 5.4.11 shows the item means and standard deviations for each sample, breaking out the civilian and military pseudo-applicant samples and the controller participants. The means for the samples appear to be fairly similar. Table 5.4.12 shows the inter-item correlations of the CUE-Plus items. All the items were significantly correlated with each other.
Reliability of CUE-Plus
Using data from 1,541 respondents, the original 12-item CUE Scale yielded a reliability coefficient (alpha) of .92. The scale mean was 36.58 (SD = 11.34). The CUE-Plus, with 17 items and 1,533 respondents, had a reliability coefficient (alpha) of .94. The scale mean was 51.47 (SD = 16.11). Given the high intercorrelation between the items, this is not surprising. The item-total statistics are shown in Table 5.4.13. There is a high degree of redundancy among the items. The reliability coefficients for the samples are as follows: controllers, .93,


civilian pseudo-applicants, .91, and military pseudo-applicants, .93, indicating that there were no large differences between sub-groups in responding to the CUE-Plus items.

Factor Analysis
Principal components analysis indicated that CUE-Plus had two factors, but examination of the second
factor showed that it made no logical sense. Varimax and
oblique rotations yielded the same overall results. The
item I often use a mainframe computer system did not
load strongly on either factor, probably because few
individuals use mainframe computers. The varimax rotation showed an inter-factor correlation of .75. Table
5.4.14 shows the eigenvalues and percentages of variance
accounted for by the factors. The eigenvalues and variance accounted for by the two-factor solution are shown
in Table 5.4.15. The first factor accounts for over half of
the variance in the responses, with the second factor
accounting for only 6%. The last column in Table 5.4.16
shows the component matrix when only one factor was
specified. Taken together, the data suggests that one factor
would be the simplest explanation for the data structure.
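The eigenvalue and variance-accounted-for summaries in Tables 5.4.14 through 5.4.16 come from a principal components analysis of the item correlation matrix. A generic sketch of that step (illustrative data only, not the original analysis):

```python
import numpy as np

def pca_eigenvalues(items: np.ndarray):
    """Eigenvalues of the item correlation matrix and proportions of variance."""
    corr = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # sorted largest first
    return eigvals, eigvals / eigvals.sum()

# Illustrative items sharing one common factor, as the text describes.
rng = np.random.default_rng(1)
items = rng.normal(size=(500, 17)) + rng.normal(size=(500, 1))
vals, props = pca_eigenvalues(items)
print(np.round(vals[:3], 2), np.round(props[:3], 2))
```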

Summary
All in all, these results show the CUE-Plus to have very
small differences for both gender and race. To the extent
that the instrument predicts scores on the test battery,
test differences are not likely to be attributable to computer familiarity.

RELATIONSHIP BETWEEN CUE-PLUS AND PREDICTOR SCORES
Correlations
An argument could be made that one's familiarity
with and use of computers could influence scores on the
computerized predictor battery. To address that question, correlations between the individual CUE-Plus
items and the CUE-Plus total score with the AT-SAT
predictor scores were computed. One area of interest is
to what extent computer familiarity will affect the scores
of applicants. To better examine the data in this light, the
sample was separated into controllers and pseudo-applicants and separate correlations performed for the two
groups. The correlations for the controller sample are
shown in Tables 5.4.18 and 5.4.19. Table 5.4.18 shows
the correlations between the CUE items and Applied
Math, Angles, Air Traffic Scenarios, Analogy, Dials, and
Scan scores. Table 5.4.19 shows the correlations between
CUE-Plus and Letter Factory, Memory, Memory Recall, Planes, Sounds and Time-Wall (TW) scores. Tables
5.4.20 and 5.4.21 contain the same information for the
pseudo-applicant sample. In general, the CUE-Plus scores
were more highly correlated with performance on the
AT-SAT battery for the pseudo-applicants than for the
controllers.
The CUE-Plus total score was correlated (p < .05 or
p < .01) with all predictor scores with the exception of
those for Analogy: Latency and Time-Wall: Perceptual
Speed for the pseudo-applicants. The same was true for
the controller sample with regard to Air Traffic Scenarios: Accuracy, Memory: Number Correct, Recall:
Number Correct, Planes: Projection and Planes: Time
Sharing. Given the widespread use of computers at work and school and the use of Internet services, this rate of correlation is not surprising.

PERFORMANCE DIFFERENCES
Gender Differences
The overall mean for the CUE-Plus was 51.31 (SD
= 16.09). To see whether males performed significantly differently than females on the CUE-Plus, difference scores were computed for each sample. The
difference score (d) is the standardized mean difference
between males and females. A positive value indicates
superior performance by males. The results are reported
in Table 5.4.16. For all samples, males scored higher on
the CUE (i.e., were more familiar with or used computers for a wider range of activities), but at most, these
differences were only moderate (.04 to .42).
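The difference score (d) is a standardized mean difference. A minimal sketch of the computation, assuming the reference group's standard deviation is used as the standardizer (the convention stated for the fairness analyses in Chapter 5.6; the exact standardizer used for Table 5.4.16 is an assumption):

```python
import numpy as np

def d_score(reference: np.ndarray, comparison: np.ndarray) -> float:
    """Standardized mean difference; positive = reference group scored higher."""
    reference = np.asarray(reference, dtype=float)
    comparison = np.asarray(comparison, dtype=float)
    return (reference.mean() - comparison.mean()) / reference.std(ddof=1)

# Illustrative use: males as the reference group, females as the comparison group.
rng = np.random.default_rng(2)
males, females = rng.normal(52, 16, 900), rng.normal(50, 16, 600)
print(round(d_score(males, females), 2))
```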
Ethnic Differences
Performance differences on the CUE-Plus between
ethnic groups were also investigated. The means, standard deviations, and difference scores (d) for each group are presented in Table 5.4.17. The table is split out by
sample type (e.g., Controller, Military PA, Civilian PA).
Comparisons were conducted between Caucasians and
three comparison groups: African-Americans, Hispanics, and all non-Caucasian participants. A positive value
indicates superior performance by Caucasians; a negative value indicates superior performance by the com-


The Letter Factory test scores on Situational Awareness and Planning and Thinking Ahead are highly correlated with the individual CUE-Plus items for the
pseudo-applicants, while the controllers' Planning and Thinking Ahead scores were more often correlated with the CUE-Plus items than were their Awareness scores. One explanation for these high correlations is that the more comfortable one is with various aspects of using a computer, the more cognitive resources can be allocated for planning. When the use of the computer is automatic, more concentration can be focused on the specific task.
The Time-Wall perception scores (Time Estimate
Accuracy and Perceptual Accuracy) are highly correlated
with the individual CUE items for the pseudo-applicants and correlated to a lesser extent for the controllers.
The reverse is true for the Perceptual Speed variable: the
controller scores are almost all highly correlated with
CUE-Plus items, while only two of the items are correlated for the pseudo-applicants. The Time-Wall test will
not be included in the final test battery, so this is not a
consideration as far as fairness is concerned.
The "Using a mainframe computer" item correlated with only one of the test battery scores for the controller sample, but correlated highly with several test scores for the
pseudo-applicants. The fact that controllers use mainframes in their work probably had an effect on their
correlations.

best predictor of performance. Negative b weights for


gender indicate that males performed better than females. The positive weights for age indicate that the older
the individual, the higher their score on the Applied
Math test. Education and CUE-Plus score were also
positively weighted, indicating that the more education
one received and the more familiar one is with computers, the better one is likely to do on the Applied Math test.
Caucasian participants scored higher than did their
comparison groups. The statistics for each variable entered are shown in Table 5.4.22.
Angles Test
The same general pattern of results holds true for the
Angles test. Table 5.4.23 shows the statistics for each
variable. Age was not a predictor of performance for this
test in any of the comparisons. The other variables were
predictive for the Caucasian/African-American and the
Caucasian/Minority models. Race was not a predictor
for the Caucasian/Hispanic model. In all cases, females
performed less well than males. Amount of education
and CUE-Plus were positive indicators of performance.
The predictor sets accounted for about 10% of the variance in Angles test scores; the CUE-Plus score contributed little to explaining the variance in scores.
Air Traffic Scenarios
The predictor variables accounted for between 15%
and 20% of the variance in the Efficiency scores (see
Table 5.4.24), but only about 3% for Safety (Table
5.4.25) and 7% for Procedural Accuracy (Table 5.4.26).
CUE-Plus scores were predictive of performance for all
three variables, but not particularly strongly. Age was a
positive predictor of performance for only the Procedural Accuracy variable. Gender was a predictor for
Efficiency in all three models, but not consistently for
the other two variables. Education predicted only Procedural Accuracy. Race was not a predictor for the Caucasian/Hispanic models, although it was for the other
models.

Regression Analyses
Regression analyses were conducted to investigate the
extent to which the CUE-Plus and four demographic
variables predict test performance. The dependent variables predicted were the measures that are used in the test
battery. Dummy variables for race were calculated, one
to compare Caucasians and African-Americans, one to
compare Hispanics to Caucasians, and the third to
compare all minorities to Caucasians. Those identified as Caucasian were coded as 1; members of the comparison groups were coded as 0. A total of 1,497 cases were analyzed. Thus, five variables were used in each regression analysis: one of the three race variables, education, age, gender, and score on CUE-Plus.
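A sketch of one such equation (the Caucasian/African-American comparison for the Applied Math score); the data and column layout below are illustrative placeholders, not the study's actual file, and the fit is shown with ordinary least squares rather than the original statistical package:

```python
import numpy as np

def fit_ols(y, X):
    """Ordinary least squares with an intercept; returns the b weights."""
    X = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, b_race, b_education, b_age, b_gender, b_cue]

# Illustrative data: race dummy (1 = Caucasian, 0 = comparison group),
# education (years), age, gender (1 = male, 0 = female), CUE-Plus total.
rng = np.random.default_rng(3)
n = 1497
race, gender = rng.integers(0, 2, n), rng.integers(0, 2, n)
education, age = rng.integers(12, 19, n), rng.integers(20, 56, n)
cue = rng.normal(51, 16, n)
applied_math = 10 + 2*race + 0.5*education + 0.1*age + gender + 0.05*cue + rng.normal(0, 5, n)
print(np.round(fit_ols(applied_math, np.column_stack([race, education, age, gender, cue])), 2))
```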

Analogy Test
Age was a fairly consistent predictor for the Information Processing (see Table 5.4.27) and Reasoning variables (see Table 5.4.28), although it did not predict
Reasoning performance in the Caucasian/Minority and
Caucasian/African-American equations. Education was
a negative predictor for Information Processing, but was
positively related to Reasoning. CUE-Plus was a predic-

Applied Math
The variables described above were entered as predictors for the total number of items correct. For all three
comparisons, all variables were included in the final
model. That model accounted for approximately 20% of
the variance for all three comparisons. Gender was the


tor for Reasoning, but not for Information Processing.


Together, the independent variables accounted for about 11% of the variance in the Information Processing scores and about 16% of the variance in the Reasoning scores.

controller sample. The correlations were higher for the


pseudo-applicant sample. To further investigate these
relationships, regression analyses were conducted to see
how well CUE-Plus and other relevant demographic
variables predicted performance on the variables that
were used in the V 1.0 test battery.
The results showed that overall, the demographic
variables were not strong predictors of test performance.
The variables accounted for relatively little of the variance in the test scores. CUE-Plus was identified as a
predictor for nine of the eleven test scores. However,
even for the scores where CUE-Plus was the strongest
predictor of the variables entered, it accounted for no
more than 8% of the variance in the score. In most of the
scores, the effect, although statistically significant, was negligible in practical terms.

Dials Test
The number of items correct on the Dials test was
predicted by gender, education, race and CUE-Plus.
Table 5.4.29 shows the statistics associated with the
analysis. Males are predicted to score higher than females; those with higher education are predicted to
perform better on the test than those with less education.
Race was positively related with Dials scores, indicating
that Caucasians tended to score higher than their comparison groups. CUE-Plus was a significant, but weak
predictor for the Caucasian/Minority and Caucasian/
African-American models. It did not predict performance in the Caucasian/Hispanic model. The four
variables accounted for between 8% and 10% of the
variance in Dials test performance.

SUMMARY
This chapter described the participants in the AT-SAT validation study. The participants represented both
genders and the U.S. ethnicities likely to form the pool
of applicants for the Air Traffic Controller position.
In addition to describing the demographic characteristics of the sample on which the test battery was validated, this chapter also described a measure of computer
familiarity, CUE. CUE was developed by Potosky and
Bobko (1997) and revised for this effort (CUE-Plus).
The CUE-Plus is a highly reliable scale (alpha = .92);
factor analysis indicated that there was only one interpretable factor. Analysis of the effect of gender on CUE-Plus scores showed moderate differences for the controller sample and none for the pseudo-applicant sample; males scored higher on the CUE-Plus than did females. There
were also small to moderate differences in CUE-Plus for
ethnicity. The strongest differences were found in the
military pseudo-applicant sample.
CUE-Plus items showed a moderate to high correlation with the variables assessed in the validation study.
The CUE-Plus was also shown to be a fairly weak but
consistent predictor of performance on the variables that
were included in V 1.0 test battery. Although there were
some performance differences attributable to gender, race, and computer experience, none of these were extremely strong. The effects of computer skill would be washed out by recruiting individuals who have strong computer skills.

Letter Factory Test


The Letter Factory test had two scores of interest:
Situational Awareness and Planning and Thinking Ahead.
Age and gender did not predict for either score. Race and
CUE-Plus score were predictors for both variables; education was a predictor for Situational Awareness. These
variables accounted for between 7% and 12% of the
variance in the Situational Awareness score (see Table
5.4.30) and 11% to 15% of the variance in the Planning
and Thinking Ahead score (see Table 5.4.31).
Scan Test
The variables in the regression equation accounted for
only 1% to 3% of the variance in the Scan score (see
Table 5.4.32). Education was a positive predictor for all
three equations. Race was a predictor for the Caucasian/
African-American model. CUE-Plus score positively predicted performance in the Caucasian/Hispanic equation.
Summary
The question of interest in this section has been the
extent to which computer familiarity, as measured by
CUE-Plus, influences performance on the AT-SAT test
battery. The correlation matrices indicated a low to
moderate level of relationship between CUE-Plus and
many of the variables in the pilot test battery for the


CHAPTER 5.5
PREDICTOR-CRITERION ANALYSES
Gordon Waugh, HumRRO
Overview of the Predictor-Criterion Validity
Analyses
The main purpose of the validity analyses was to
determine the relationship of AT-SAT test scores to air
traffic controller job performance. Additional goals of
the project included selecting tests for the final AT-SAT
battery, identifying a reasonable cut score, and developing an approach to combine the various AT-SAT scores into a single final score. Several steps were
performed during the validity analyses:

Based on the analyses of the dimensions underlying


the criteria, it was concluded that the criteria space could
be summarized with four scores: (a) the CBPM score, (b)
a single composite score of the 10 Behavior Summary
Scales (computed as the mean of the 10 scales), (c) HiFi
1: Core Technical score (a composite of several scores)
and (d) HiFi 2: Controlling Traffic Safely and Efficiently (a composite of several scores). The small sample
size for the HiFi measures precluded their use in the
selection of a final predictor battery and computation of
the predictor composite. They were used, however, in
some of the final validity analyses as a comparison
standard for the other criteria.
A single, composite criterion was computed using the
CBPM score and the composite Ratings score. Thus, the
following three criteria were used for the validity analyses: (a) the CBPM score, (b) the composite Ratings
score, and (c) the composite criterion score.

Select the criteria for validation analyses


Compute zero-order validities for each predictor score
and test
Compute incremental validities for each test
Determine the best combination of tests to include in
the final battery
Determine how to weight the test scores and compute
the predictor composite score
Compute the validity coefficients for the predictor
composite
Correct the validity coefficient for statistical artifacts

Zero-Order Validities
It is important to know how closely each predictor
score was related to job performance. Only the predictor
scores related to the criteria are useful for predicting job
performance. In addition, it is often wise to exclude tests
from a test battery if their scores are only slightly related
to the criteria. A shorter test battery is cheaper to develop,
maintain, and administer and is more enjoyable for the
examinees.
Therefore, the zero-order correlation was computed
between each predictor score and each of the three
criteria (CBPM, Ratings, and Composite). Because some
tests produced more than one score, the multiple correlation of each criterion with the set of scores for each
multi-measure test was also computed. This allowed the
assessment of the relationship between each test, as a
whole, and the criteria. These correlations are shown in
Table 5.5.1 below.
Ideally, we would like to know the correlation between the predictors and the criteria among job applicants. In this study, however, we did not have criteria
information for the applicants (we did not actually use
real applicants but rather pseudo-applicants). That would
require a predictive study design. The current study uses

Many criterion scores were computed during the


project. It was impractical to use all of these scores during
the validation analyses. Therefore, a few of these scores
had to be selected to use for validation purposes. The
three types of criterion measures used in the project were
the CBPM (Computer-Based Performance Measure),
the Behavior Summary Scales (which are also called
Ratings in this chapter), and the HiFi (High Fidelity
Performance Measure). The development, dimensionality, and construct validity of the criteria are discussed at
length in Chapter 4 of this report.
The CBPM was a medium fidelity simulation. A
computer displayed a simulated air space sector while the
examinee answered questions based on the air traffic
scenario shown. The Behavior Summary Scales were performance ratings completed by the examinees' peers and supervisors. The HiFi scores were based upon observers' comprehensive ratings of the examinees' two-day performance on a high-fidelity air traffic control simulator.

a concurrent design: We computed the predictor-criteria


correlations using current controllers. Correlations are
affected by the amount of variation in the scores. Scales
with little variation among the scores tend to have low
correlations with other scales. In this study, the variation
in the predictor scores was much greater among the
pseudo-applicants than among the controllers. Therefore, we would expect the correlations to be higher
within the pseudo-applicant sample. A statistical formula, called correction for range restriction, was used to
estimate what these correlations would be among the
pseudo-applicants. The formula requires three values: (a) the uncorrected correlation, (b) the predictor's standard deviation for the pseudo-applicant sample, and (c) the predictor's standard deviation for the controller sample.
Table 5.5.1 shows both the corrected and uncorrected
correlations. The amount of correction varies among the
predictors because the ratio of the pseudo-applicant vs.
controller standard deviations also varies. The greatest
correction occurs for predictors which exhibit the greatest differences in standard deviation between the two
samples (e.g., Applied Math). The least correction (or
even downward correction) occurs for predictors whose
standard deviation differs little between the two samples
(e.g., the EQ scales).
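The report does not reproduce the correction formula itself. The standard Thorndike Case 2 correction for direct range restriction uses exactly the three quantities listed above; the sketch below assumes that formula (placeholder values, not the study's):

```python
import math

def correct_range_restriction(r: float, sd_unrestricted: float, sd_restricted: float) -> float:
    """Estimate the correlation in the wider (pseudo-applicant) group from the
    restricted (controller) correlation, assuming the Thorndike Case 2 formula."""
    u = sd_unrestricted / sd_restricted          # ratio of standard deviations
    return (r * u) / math.sqrt(1 + r * r * (u * u - 1))

# Example: r = .30 among controllers; pseudo-applicant SD twice the controller SD.
print(round(correct_range_restriction(0.30, 2.0, 1.0), 3))
```

The greater the ratio of the pseudo-applicant to controller standard deviations, the larger the upward correction, which matches the pattern described for Table 5.5.1.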
Table 5.5.1 shows that most of the tests exhibit
moderate to high correlations with the CBPM and low
to moderate correlations with the Ratings. Some scales, however, had no significant (p < .05) correlations with
the criteria: the Information Processing Latency scale
from the Analogies test and 2 of the 14 scales from the
Experiences Questionnaire (Tolerance for High Intensity
and Taking Charge). In addition, these two EQ scales
along with the EQ scale, Working Cooperatively, correlated negatively with the CBPM and composite criteria.
Thus, it is doubtful that these scores would be very useful
in predicting job performance. Analyses of their incremental validities, discussed below, confirmed that
these scores do not significantly improve the prediction of the criteria.
The EQ (Experiences Questionnaire) is a self-report
personality inventory. It is not surprising, then, that its
scales do not perform as well as the other tests (which are all cognitive measures) in predicting the CBPM, which is largely a cognitive measure. The cognitive tests
were generally on a par with the EQ in predicting the
Ratings criterion. A notable exception was the Applied
Math test, which greatly outperformed all other tests in
predicting either the CBPM or the Ratings. Note that the

Ratings criterion is a unit-weighted composite of the 10


behavior summary scales completed by supervisors. The
EQ correlated quite highly with a number of these
behavior summary scales, e.g., the four scales making up
the Technical Effort factor, and the single scale in the
teamwork factor, but not very highly with the composite
Ratings criterion.
Composure and Concentration are the only EQ scales
that correlate above .08 with the CBPM, whereas eight
scales correlate this highly with the Ratings. This is not
surprising because both personality measures and performance ratings incorporate non-cognitive performance factors such as motivation. The moderate size of the multiple correlation of the EQ with the CBPM (.16) is misleadingly high because three of the EQ scales correlate negatively with the CBPM. The size of a multiple
correlation is usually just as large when some of the
correlations are negative as when all are positive. Scales
that correlate negatively with the criterion, however,
should not be used in a test battery. Otherwise, examinees scoring higher on these scales would get lower scores
on the battery. When the three scales that correlate
negatively with the CBPM are excluded, the EQ has a
multiple correlation of only .10 (corrected for shrinkage)
with the CBPM.
Incremental Validities
At this point, all the scores (except for the Information Processing score from the Analogies test and 7 of the 14 scores from the Experiences Questionnaire) have demonstrated that they are related to the criteria. The
next step was to determine which scales have a unique
contribution in predicting the criteria. That is, some
scales might not add anything to the prediction because
they are predicting the same aspects of the criteria as
some other scales.
If two tests predict the same aspects of the criteria then
they are redundant. Only one of the tests is needed. The
amount of the unique contribution that a test makes
toward predicting a criterion is called incremental validity. More precisely, the incremental validity of a test is the
increase in the validity of the test battery (i.e., multiple
correlation of the criterion with the predictors) when
that test is added to a battery.
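Incremental validity can therefore be computed as the change in the battery's multiple correlation when a test's scores are added. A minimal sketch of that computation (illustrative data only, not the project's software):

```python
import numpy as np

def multiple_r(y: np.ndarray, X: np.ndarray) -> float:
    """Multiple correlation of y with the columns of X (OLS with an intercept)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    predicted = X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
    return float(np.corrcoef(y, predicted)[0, 1])

def incremental_validity(y, battery_scores, new_test_scores):
    """Increase in multiple R when a test is added to an existing battery."""
    with_test = np.column_stack([battery_scores, new_test_scores])
    return multiple_r(y, with_test) - multiple_r(y, battery_scores)

# Illustrative data only.
rng = np.random.default_rng(4)
criterion = rng.normal(size=1000)
battery = np.column_stack([criterion + rng.normal(size=1000) for _ in range(3)])
new_test = criterion + rng.normal(size=1000)
print(round(incremental_validity(criterion, battery, new_test), 3))
```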
Table 5.5.2 shows the incremental validities for each
test and scale. There are two values for most tests. The
first value shows the incremental validity when the test
is added to a battery that contains all the other tests; the
other value shows the incremental validity when the test


is added to only the tests in the final AT-SAT battery. In


addition, incremental validities for the final version of
the EQ test (in which three of the original EQ scales were
dropped) are shown.
Three tests have a substantial unique contribution to the prediction of the criteria. Each has an incremental validity greater than .10 (corrected for shrinkage but not for range restriction). They are, in order of
decreasing incremental validity, Applied Math, EQ,
and Air Traffic Scenarios.

computer program was written (using Visual Basic)


which essentially considered all these parameters simultaneously. In choosing the set of optimal scale weights,
the program considered the following sets of parameters
of the resulting predictor composite: overall validity,
differences in group means, differences in the groups' regression slopes, and differences in the groups' intercepts. There were three parameters for each type of group difference: females vs. males, blacks vs. whites, and Hispanics vs. whites. One final feature of the program is that it would not allow negative weights. That is, if a scale's computed weight was such that a high score on the scale would lower the overall score, then the scale's weight was set to zero.
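The Visual Basic program itself is not reproduced in the record. The sketch below is only a rough Python illustration of the kind of non-negative weight search described (maximize validity while penalizing a group mean difference), using an assumed objective function and a simple random search rather than the original algorithm:

```python
import numpy as np

def search_weights(scales, criterion, group_flags, penalty=0.5, n_trials=2000,
                   rng=np.random.default_rng(8)):
    """Random search for non-negative scale weights that maximize the composite's
    validity minus a penalty on the standardized group mean difference."""
    scales = np.asarray(scales, dtype=float)
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.random(scales.shape[1])              # non-negative by construction
        composite = scales @ w
        validity = np.corrcoef(composite, criterion)[0, 1]
        d = (composite[group_flags == 1].mean()
             - composite[group_flags == 0].mean()) / composite.std(ddof=1)
        score = validity - penalty * abs(d)
        if score > best_score:
            best_w, best_score = w / w.sum(), score
    return best_w

# Illustrative data only: 6 scales, a criterion, and a 0/1 group flag.
rng = np.random.default_rng(9)
scales = rng.normal(size=(500, 6))
criterion = scales[:, :3].sum(axis=1) + rng.normal(size=500)
flags = rng.integers(0, 2, 500)
print(np.round(search_weights(scales, criterion, flags), 2))
```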
Several computer runs were made. For each run, the relative importance of the parameters was varied. The
goal was to maximize the overall validity while minimizing group differences. In the end, the group difference
with the greatest effect on the overall validity was the
black vs. white group mean on the composite predictor.
Thus, the ultimate goal became to reduce the differences
between the black and white means without reducing the
maximum overall validity by a statistically significant amount.
There were only nine scales remaining with non-zero
weights after this process. This low number of scales was
undesirable. It is possible that some of the excluded tests
might perform better in a future predictive validity study
than in the concurrent study. If these tests are excluded
from the battery, then there will be no data on them for
the predictive validity study. Another limitation of this
technique is that the weights will change, possibly substantially, if applied to another sample.
Therefore, a combination of the validity weighting
and optimal weighting schemes was used. For each scale,
the weight used was the mean of the optimal and validity
weights. A description of the computation of the validity
and optimal weights follows.
The computation of the validity weights for a single-scale test was straightforward. It was merely the correlation, corrected for range restriction, of the scale with the composite criterion. The computation for the multi-scale tests was somewhat more complex. First, the multiple correlation, corrected for range restriction, of the test with the composite criterion was computed. This represents the contribution of the test to the composite predictor. Then, the correlations of each of the test's scales with the composite criterion, corrected for range
restriction, were computed. The validity weights of the
scales were computed according to the following formula:

Determination of Scale Weights for the Test Battery


The full AT-SAT battery would require more than a
day of testing time. Thus, it was desired to drop some of
the tests for this reason alone. Therefore, several tests
were excluded from the final test battery taking into
consideration the following goals:
1. Maintain high concurrent validity.
2. Limit the test administration time to a reasonable
amount.
3. Reduce differences between gender/racial group
means.
4. No significant differences in prediction equations
(i.e., regression slopes or intercepts) favoring males or
whites (i.e., no unfairness).
5. Retain enough tests to allow the possibility of increasing the predictive validity as data becomes available
in the future.
There are typically three main types of weighting
schemes: regression weighting, unit weighting, and validity weighting. In regression weighting, the scales are
weighted to maximize the validity of the predictor composite in the sample of examinees. The main problem
with this scheme is that the validity drops when the
predictor weights are used in the population. Unit weighting gives equal weight to each scale or test. It tends to
sacrifice some sample validity, but its validity does not
typically drop in the population because the weights are
chosen independent of the sample. Validity weighting assigns each scale's simple validity as its weight. This
scheme is a compromise between the two methods.
Validity weights do almost as well as regression weights
in the sample. More importantly, validity weights are less
sensitive to differences in samples than regression weights.
The large number of scales and parameters to consider for each scale made it difficult to subjectively decide
which tests to drop. For each scale, ten parameters were
relevant to this decision. To aid in this decision, a


$w_i = R \, \dfrac{r_i}{\sum_{j=1}^{k} r_j}$   [Equation 5.5.1]

$\text{raw composite predictor} = \sum_{i=1}^{k} w_i z_i$   [Equation 5.5.3]

where k = the number of predictors, wi = the rescaled


combined weight of the ith predictor, and zi = the z-score of the ith predictor. In other words, the raw composite predictor score is the weighted sum of the z-scores. This score was rescaled such that a score of 70
represented the cut score and 100 represented the
maximum possible score. This is the scaled AT-SAT
battery score. The determination of the cut score is
described later in this chapter. To simplify the
programming of the software that would administer and
score the AT-SAT battery, a set of weights was computed
that could be applied to the raw predictor scores to
obtain the scaled AT-SAT battery score. Thus the scaled
AT-SAT battery score was computed according to Equation 5.5.4, shown below.

For Equation 5.5.1, wi = validity weight of scale i, ri = correlation of the


predictor scale with the criterion, R = multiple correlation
of the test with the criterion, rj = the correlation with the
criterion of the scale j of the k scales within the test. All
correlations were corrected for range restriction.
The validity weights and optimal weights had to be
put on a common metric before they could be combined.
Each validity weight was multiplied by a constant such
that all the weights summed to 1.00. Similarly, each
optimal weight was multiplied by a constant such that all
the weights summed to 1.00. Each predictor's combined weight was then computed as the mean of its rescaled optimal and validity weights. Finally, the combined weight was rescaled in the same manner as the validity and optimal weights. That is, each combined weight was multiplied by a constant such that all the weights summed to 1.00. This rescaling was done to aid interpretation of the weights. Each weight represents a predictor's relative
contribution, expressed as a proportion, to the predictor
composite.
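A sketch of the weighting arithmetic described above (Equation 5.5.1 plus the rescaling of the validity and optimal weights to sum to 1.00 and their averaging); the numeric inputs are placeholders, not the study's values:

```python
import numpy as np

def validity_weights(scale_rs, test_R):
    """Equation 5.5.1: w_i = R * r_i / sum(r_j) for the scales within one test."""
    scale_rs = np.asarray(scale_rs, dtype=float)
    return test_R * scale_rs / scale_rs.sum()

def combine_weights(validity_w, optimal_w):
    """Rescale each weight set to sum to 1.00, average them, and rescale again."""
    v = np.asarray(validity_w) / np.sum(validity_w)
    o = np.asarray(optimal_w) / np.sum(optimal_w)
    combined = (v + o) / 2.0
    return combined / combined.sum()

# Placeholder weights: one three-scale test plus two single-scale tests.
validity_w = np.concatenate([validity_weights([0.40, 0.25, 0.20], 0.50), [0.45, 0.30]])
optimal_w = np.array([0.30, 0.10, 0.00, 0.40, 0.20])
print(np.round(combine_weights(validity_w, optimal_w), 3))  # proportions summing to 1.00
```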

$\text{Scaled AT-SAT Battery Score} = \sum_{i=1}^{k} w_i x_i$   [Equation 5.5.4]
where k = the number of predictors, wi = the raw-score
weight of the ith predictor, and xi = the raw score of the
ith predictor.
The effects of using various weighting schemes are
shown in Table 5.5.3. The table shows the validities both
before and after correcting for shrinkage and range
restriction. Because the regression procedure fits an
equation to a specific sample of participants, a drop in
the validity is likely when the composite predictor is used
in the population. The amount of the drop increases as
sample size decreases or the number of predictors increases. The correction for shrinkage attempts to estimate the amount of this drop. The formula used to
estimate the validity corrected for shrinkage is referred to
by Carter (1979) as Wherry (B) (Wherry, 1940). The formula is shown as Equation 5.5.5 below.

Predictor Composite
The predictor composite was computed using the
combined predictor weights described above. Before
applying the weights, the predictor scores had to be
transformed to a common metric. Thus, each predictor
was standardized according to the pseudo-applicant sample. That is, a predictor's transformed score was computed as a z-score according to the following formula:

$z = \dfrac{x - \mu_p}{\sigma_p}$   [Equation 5.5.2]

where z = the predictor's z-score, x = the raw predictor score, $\mu_p$ = the predictor's mean score in the pseudo-applicant sample, and $\sigma_p$ = the predictor's standard deviation in the pseudo-applicant sample (i.e., the estimate of the predictor's standard deviation in the population based on the pseudo-applicant sample data).
The predictor composite was then computed by applying the rescaled combined weights to the predictor z-scores. That is, the predictor composite was computed according to Equation 5.5.3, shown earlier.

$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - k - 1}$   [Equation 5.5.5]

where $\bar{R}$ = the validity corrected for shrinkage, R = the uncorrected validity, n = the sample size, and k = the
number of predictors. Where validities were corrected
for both range restriction and shrinkage, the shrinkage
correction was performed first.
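A sketch of the shrinkage correction in Equation 5.5.5, taking the square root of the adjusted squared multiple correlation to express the result as a validity coefficient (placeholder values, not the study's):

```python
import math

def wherry_shrunken_R(R: float, n: int, k: int) -> float:
    """Equation 5.5.5: validity corrected for shrinkage (Wherry, 1940)."""
    adj_R2 = 1.0 - (1.0 - R * R) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adj_R2, 0.0))

# Placeholder values: a multiple R of .70 from 1,000 cases and 26 predictor scales.
print(round(wherry_shrunken_R(0.70, 1000, 26), 3))
```

As the text notes, the expected drop grows as the sample size decreases or the number of predictors increases.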


As noted above, the final AT-SAT score was computed using the Combined method of weighting the
predictors. Only the regression method had a higher
validity. In fact, the Combined method probably has a
higher validity if we consider that its correction for
shrinkage overcorrects to some extent. Finally, the regression-weighted validity is based on all 35 scales
whereas the Combined validity is based on just 26
tests. Thus, the Combined weighting method produces the best validity results.
The Combined method produced the second-best
results in terms of mean group differences and fairness.
Only the Optimal low d-score weighting method had
better results in these areas, and its validity was much lower than the Combined method's validity. None of the
weighting methods produced a statistically significant
difference in standardized regression slopes among the
groups. Thus, the Combined weighting method was the
best overall. It had the highest validity and the second-best results in terms of group differences and fairness.
Therefore, the Combined weighting method was used to
compute the final AT-SAT battery score.

CBPM only once. Previous research has found that


similar measures (i.e., situational judgement tests) have test-retest reliabilities of about .80, with most in the range between .7 and .9. Thus, three different reliabilities were used to correct the CBPM's validity for unreliability: .8 (best guess), .9 (upper bound estimate), and .7 (lower bound estimate). The reliability of the
composite measure could not be directly measured.
Therefore, an approximation of the composite criterion
reliability was computed as the mean of the ratings and
CBPM reliabilities.
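The correction for criterion unreliability itself is not written out in the text; the sketch below assumes the classical correction for attenuation (dividing the observed validity by the square root of the criterion reliability), with placeholder values:

```python
import math

def correct_for_criterion_unreliability(r_observed: float, criterion_reliability: float) -> float:
    """Classical correction for attenuation due to an unreliable criterion."""
    return r_observed / math.sqrt(criterion_reliability)

# Composite criterion reliability approximated as the mean of the ratings and
# CBPM reliabilities (placeholder values shown).
composite_reliability = (0.80 + 0.85) / 2
print(round(correct_for_criterion_unreliability(0.60, composite_reliability), 3))
```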
Determining the Cut Score
One of the specifications for the AT-SAT battery was
that a score of 70 would represent the cut score and a
score of 100 would represent the highest possible score.
The cut score and maximum score were first determined on the AT-SAT battery's original scale. Then these two
scores were transformed to scores of 70 and 100 on the
scaled AT-SAT battery scale.
The determination of the highest possible score was
relatively straightforward. There was, however, one complication. The maximum possible scores for the simulation scales (i.e., Letter Factory scales, Air Traffic Scenarios
scales) and some of the other scales (e.g., Analogies
information processing scores) were unknown. Thus,
the determination of the highest possible score was not
simply a matter of adding up the maximum scores
possible for each scale. For the scales with an unknown
maximum possible score, the maximum scores attained
during the study were used to estimate the highest scores
likely to be attained on these scales in the future.
The determination of the cut score was more involved. The main goal in setting the cut score was to at
least maintain the current level of job performance in the
controller workforce. After examining the effects of
various possible cut scores on controller performance, a
cut score was selected that would slightly improve the job
performance of the overall controller workforce. Specifically, the cut score was set such that the mean predicted
criterion score, among pseudo-applicants passing the
battery, was at the 56th percentile of the current controller distribution of criterion scores.
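A sketch of that cut-score logic, under stated assumptions (simulated predictor and predicted-criterion arrays stand in for the regression-based predictions actually used): the cut is raised until the mean predicted criterion among passing pseudo-applicants reaches the 56th percentile of the controller criterion distribution.

```python
import numpy as np

def find_cut_score(pa_predictor, pa_predicted_criterion, controller_criterion,
                   target_percentile=56):
    """Smallest predictor cut such that pseudo-applicants at or above the cut
    have a mean predicted criterion at the target controller percentile."""
    target = np.percentile(controller_criterion, target_percentile)
    for cut in np.sort(pa_predictor):
        passing = pa_predicted_criterion[pa_predictor >= cut]
        if passing.size and passing.mean() >= target:
            return cut
    return None

# Illustrative data only.
rng = np.random.default_rng(5)
pa_pred = rng.normal(0, 1, 518)
pa_crit_hat = 0.6 * pa_pred + rng.normal(0, 0.4, 518)
controller_crit = rng.normal(0.8, 0.9, 1000)
print(round(find_cut_score(pa_pred, pa_crit_hat, controller_crit), 2))
```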
Table 5.5.5 shows the effects of this cut score on
selection rates and predicted job performance. If all the
pseudo-applicants were hired, their mean job performance would be at only the 33rd percentile of the current
controller distribution. Thus, using the AT-SAT Battery, with the chosen cut score, is considerably better

Final AT-SAT Battery Validity


The best estimate of the validity of the AT-SAT
battery is .76. This value is extremely high. Table 5.5.4
shows the validity of the AT-SAT battery for various
criteria. The table also shows how various statistical
corrections affect the validity estimate. The most relevant validity of .76 is the correlation with the composite criterion, which is corrected for range restriction, shrinkage, and criterion unreliability.
The low sample size for the high fidelity criteria
precludes accurate estimates of validity. The purpose of
the high-fidelity criteria was to obtain independent
evidence that the CBPM and Ratings were related to job
performance. As shown in a previous chapter, the high
correlations of the CBPM and Ratings with the high
fidelity criteria are strong evidence that the CBPM and
Ratings are accurate indicators of job performance.
Interrater agreement reliability was used to correct the
validities for the Ratings and HiFi criteria. Reliability for
the CBPM was estimated by computing its internal
consistency (coefficient alpha = .59), but this figure is
probably an underestimate because the CBPM appears
to be multidimensional (according to factor analyses).
Ideally, the reliability for the CBPM should be computed as a test-retest correlation. This could not be
computed, however, because each examinee took the


than using no screening. That is, if all of the pseudo-applicants were hired (or some were randomly selected to be hired), their performance level would be much lower than that of the current controllers.

select applicants much above the mean of current controllers. In the past, of course, the OPM test was combined with a nine-week screening program resulting in
current controller performance levels. The AT-SAT is
expected to achieve about this same level of selectivity
through the pre-hire screening alone.
Table 5.5.6 shows the percent of high performers
expected for different cutpoints on the AT-SAT and
OPM batteries. This same information is shown graphically in Figure 5.5.2. Here, high performance is defined
as the upper third of the distribution of performance in
the current workforce as measured by our composite
criterion measure. If all applicants scoring 70 or above on
the AT-SAT are selected, slightly over one-third would
be expected to be high performers. With slightly greater
selectivity, taking only applicants scoring 75.1 or above,
the proportion of high performers could be increased to
nearly half. With a cut score of 70, it should be necessary to test about 5 applicants to find each hire. At a cut score of 75.1, the number of applicants tested per hire goes up
to about 10. By comparison, 1,376 applicants would
have to be tested for each hire to obtain exactly one-third
high performers using the OPM screen.

Impact of AT-SAT on Workforce Capabilities


Figure 5.5.1 shows the relationship between scores on
the AT-SAT battery and the expected or average performance of examinees at each score level. For comparison
purposes, the previous OPM battery, which had a (generously corrected) validity of about .30 has been placed
on the same scale as the AT-SAT composite. The primary point is that applicants who score very high (at 90)
on the AT-SAT are expected to perform near the top of
the distribution of current controllers (at the 86th percentile). Applicants who score very high (at 90) on the OPM
test, however, are expected to perform only at the middle
of the distribution of current controllers (at the 50th
percentile). Only 1 out of 147 applicants would be
expected to get an OPM score this high (90 or above).
Someone with an OPM score of 100 would be expected
to perform at the 58th percentile. Consequently, there is
no way that the OPM test, by itself, could be used to


CHAPTER 5.6
ANALYSES OF GROUP DIFFERENCES AND FAIRNESS

Gordon Waugh, HumRRO


SUMMARY
The group means on the composite predictor for
females, blacks, and Hispanics were significantly lower
than the means for the relevant reference groups (males,
whites). The difference was greatest for blacks. The
cognitive tests displayed much greater differences than
did the EQ scales. However, the EQ scales had much
lower validity as well. Although the predictor composite
exhibited lower group means for minorities, no evidence
of unfairness was found. In fact, the composite predictor
over-predicted the performance of all three minority
groups (females, blacks, and Hispanics) at the cut score.
The validity coefficients and regression slopes were remarkably similar among the groups. Among the individual test scales, there were no cases (out of a possible
111) in which the slopes of the regression lines differed
significantly between a minority and reference group. These
results show that the test battery is fair for all groups.

analyses: male vs. female, white vs. black, and white


vs. Hispanic. The descriptive statistics for the predictors and criteria are shown in Tables 5.6.1 through 5.6.3.
Cut Scores
Both the analyses of sub-group differences and fairness required a cut score (i.e., a specified passing score)
for each test and for the predictor composite score.
Therefore, hypothetical cut scores had to be determined.
The cut score on the predictor composite was set at the
32nd percentile on the controller distribution. (This score
was at the 78th percentile on the pseudo-applicant distribution.) Thus, the hypothetical cut score for each test
was also set at the 32nd percentile on the controller
distribution for the purposes of the fairness and group
mean difference analyses. The determination of the cut
score is discussed elsewhere in this report. Regression
analyses predicted that the mean level of job performance
for applicants passing the AT-SAT battery would be at
the 56th percentile of the job performance of current
controllers. That is, it is predicted that applicants passing
the battery will perform slightly better than current
controllers.

INTRODUCTION
A personnel selection test may result in differences
between white and minority groups. In order to continue
to use a test that has this result, it is required to demonstrate that the test is job- related or valid. Two types of
statistical analyses are commonly used to assess this issue.
The analysis of mean group differences determines the
degree to which test scores differ for a minority group as
a whole (e.g., females, blacks, Hispanics) when compared with its reference group (i.e., usually whites or
males). Fairness analysis determines the extent to which
the relationship between test scores and job performance differs for a minority group compared to its
reference group.
Our sample contained enough blacks and Hispanics to analyze these groups separately but too few
members of other minority groups to include in the
analyses. It was decided not to run additional analyses
with either all minorities combined or with blacks and
Hispanics combined because the results differed considerably for blacks vs. Hispanics. Thus, the following
pairs of comparison groups were used in the fairness

Estimation of Missing Values


There were few blacks in the controller (n = 98) and
pseudo-applicant samples (n = 62). In addition, there
were even fewer in the analyses because of missing values
on some tests. When the composite predictor was computed, missing values on the individual scales were
estimated. Otherwise, a participant would have received
a missing value on the composite if any of his/her test
scores were missing. Each missing score was estimated
using a regression equation. The regression used the
variable with the missing score as the dependent variable
and the scale that best predicted the missing score as the
independent variable. The predictor scale had to be from
a different test than the missing score. For example, if an examinee's Applied Math score was missing, then his/her Angles score was used to estimate it. If both the Applied
Math and Angles scores were missing, then the estimated


composite predictor score would also be missing. Each


missing EQ score, however, was predicted using another
EQ scale. Missing scores were estimated only when
building the composite predictor. That is, missing values
were not estimated for analyses that used the individual
test scores. This was judged to be a conservative estimation procedure because (a) only one independent variable was used in each estimation regression, (b) none of
the blacks and few of the other examinees were missing
more than one test score, and (c) each test score contributed only a small amount to the final composite predictor score. The amount of error caused by the estimation
of missing values is very likely to be trivial. To ensure that
the covariances were not artificially increased by the
estimation of missing values, random error was added to
each estimated value.
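A minimal sketch of this estimation procedure (a one-predictor regression of the missing scale on its best predictor from another test, with random error added so covariances are not inflated), using illustrative arrays rather than the study data:

```python
import numpy as np

def impute_with_regression(target, predictor, rng=np.random.default_rng(6)):
    """Fill missing values in `target` from a one-predictor regression on
    `predictor`, adding random error to each estimated value."""
    target = np.array(target, dtype=float)
    missing = np.isnan(target)
    ok = ~missing & ~np.isnan(predictor)
    slope, intercept = np.polyfit(predictor[ok], target[ok], 1)
    residual_sd = np.std(target[ok] - (intercept + slope * predictor[ok]), ddof=2)
    target[missing] = (intercept + slope * predictor[missing]
                       + rng.normal(0, residual_sd, missing.sum()))
    return target

# Example: estimate a missing Applied Math score from the Angles score.
applied_math = np.array([20., 25., np.nan, 30., 22., 27.])
angles = np.array([15., 18., 16., 22., 14., 20.])
print(np.round(impute_with_regression(applied_math, angles), 1))
```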

Smaller differences in selection rate may nevertheless


constitute adverse impact, where they are significant in
both statistical and practical terms . . .
Therefore, the differences in the passing rates were
tested for statistical significance using 2 × 2 chi-square tests of association. For each predictor score, one chi-square analysis was done for each of the following pairs
of groups: male-female, white-black, and white-Hispanic. An example is shown in Table 5.6.4 below. This
shows how the chi-square test was computed which
compared male and female passing rates.
The groups were also compared by computing the
mean test score for each group. The differences in the
means between the minority groups and reference groups
(i.e., males or whites) were then tested for statistical
significance using independent-groups t-tests. The differences between the means were then converted to d-scores, which express these differences in terms of standard deviation units based on the reference group's standard deviation. For example, a d-score of -.48 for females
indicates that the mean female score is .48 standard
deviations below the mean of the male distribution of
scores (i.e., at the 32nd percentile of the male distribution according to a table of the normal distribution).
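A sketch of the adverse-impact screen just described: the four-fifths ratio of passing rates plus a 2 × 2 chi-square test of association. The counts below are illustrative, not the study's:

```python
from scipy.stats import chi2_contingency

def adverse_impact_check(pass_ref, fail_ref, pass_min, fail_min):
    """Four-fifths ratio of passing rates and a 2 x 2 chi-square test."""
    rate_ref = pass_ref / (pass_ref + fail_ref)
    rate_min = pass_min / (pass_min + fail_min)
    ratio = rate_min / rate_ref                      # below 0.8 triggers the rule
    chi2, p, _, _ = chi2_contingency([[pass_ref, fail_ref], [pass_min, fail_min]])
    return ratio, chi2, p

# Illustrative counts for a male (reference) vs. female comparison.
ratio, chi2, p = adverse_impact_check(pass_ref=120, fail_ref=80, pass_min=40, fail_min=60)
print(round(ratio, 2), round(chi2, 2), round(p, 4))
```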

GROUP DIFFERENCES
Analyses
Only the pseudo-applicant sample was used for the
group difference analyses. This sample best represented
the population of applicants. Therefore, air traffic controllers were excluded from these analyses.
The Uniform Guidelines on Employee Selection Procedures (Federal Register, 1978, Section 4.D.) state that
evidence of adverse impact exists when the passing rate
for any group is less than four-fifths of the passing rate for
the highest group:

Results and Conclusions


Table 5.6.5 shows the results for the passing rate
analyses. Several tests, including the predictor composite, exhibited evidence of group differences for females, blacks, and Hispanics according to the four-fifths rule.
In most of these cases, the difference in passing rates was
statistically significant. Females and Hispanics had similar passing rates; blacks had by far the lowest passing
rates.
Table 5.6.5 also shows the differences between the
group means expressed as d-scores. The significant d-scores are asterisked in the table. These results were very similar to those for the passing rates. The group and predictor combinations that had significantly lower passing rates (compared to the reference group) also tended to have significantly lower d-scores. All three minority
groups tended to score below their reference groups, but
the differences were often not statistically significant.
Blacks scored lowest on most tests. On the composite
predictor, Hispanics had the highest d-score, followed by females and blacks, respectively. The Hispanic d-score was not statistically significant.
The group differences for the EQ scales were much
lower than for the cognitive tests. (The Memory Test and
the Memory Retest, however, had very small group

A selection rate for any race, sex, or ethnic group which


is less than four-fifths (4/5) (or eighty percent) of the rate for
the group with the highest rate will generally be regarded
by the Federal enforcement agencies as evidence of adverse
impact, while a greater than four-fifths rate will generally
not be regarded by Federal enforcement agencies as evidence of adverse impact.

Therefore, the passing rates for each test were computed for all five groups (males, females, whites, blacks,
Hispanics). Then the passing rates among the groups
were compared to see if the ratio of the passing rates fell
below four-fifths. Separate comparisons were done within
the gender groups and within the racial groups. That is, males and females were compared, and blacks and Hispanics were compared to whites.
The Uniform Guidelines (Section D.4.) state that
adverse impact might exist even if the passing rate for the
minority group is greater than four-fifths of the reference group's passing rate:


differences. In fact, females did better than males on


these two tests.) For example, for blacks, the median d-score was -.48 among the 23 cognitive scores but only -.20 among the 14 EQ scales. However, the EQ scales also
had much lower validity than did the other tests. This is
probably why the passing rates are much higher for the
EQ. In fact, the passing rates on half of the EQ scales
were higher for the pseudo-applicants than for the controllers (i.e., half of the passing rates were higher than 68%,
which is the passing rate for each test in the controller
sample). In all the other tests, the passing rate was much
lower for the pseudo-applicants than for the controllers.
There are two possible reasons for the high passing
rates for the EQ scales: (a) the pseudo-applicants and
current controllers possess nearly the same levels of the
personality traits supposedly measured by the EQ or (b)
the EQ scales are measuring some unwanted constructs
(probably in addition to the traits that the scales were
designed to measure). If the first possibility is true, then
one must conclude that either these traits are not really
needed on the job or that the current controllers would
perform even better on the job if they improved in these
traits. If the second possibility is true, then some unwanted constructs, such as social desirability, are being
measured to some degree by the EQ scales.
In conclusion, the predictor composite for the final
AT-SAT battery exhibited lower scores for all three
minority groups (i.e., females, blacks, and Hispanics)
compared to their reference groups (i.e., males and
whites) in terms of both passing rates and d-scores. All of
these differences, except for the Hispanic d-score, were
statistically significant. The relative passing rates on the
predictor composite for females, blacks, and Hispanics
(compared to the passing rates for the reference groups:
males and whites) were .54, .11, and .46, respectively.
Thus, there was evidence of sub-group differences in test
performance for the three minority groups.
It should be noted that subgroup differences in predictor scores do not necessarily imply bias or unfairness.
If low test scores are associated with low criterion performance and high test scores are related to high criterion
performance, the test is valid and fair. The fairness issue
is discussed below.

controller sample. A test is considered fair when the


relationship between the predictor test and job performance is the same for all groups. In our analyses, only
differences that aid whites or males were considered to be
unfair. Fairness is assessed by performing regression
analyses using the test score as the independent variable
and the criterion measure as the dependent variable. To
assess the fairness of a predictor for females, for example,
two regressions are performed: one for males and one for
females. In theory, the predictor is considered to be fair
if the male and female regression lines are identical. In
practice, the test is considered to be fair if the difference
between the equations of the two regression lines is not
statistically significant (given a reasonable amount of power).
The equations of the two regression lines (e.g., male
vs. female regression lines) can differ in their slopes or
their intercepts. If the slopes differ significantly then the
predictor is not fair. If the slopes do not differ significantly, then the intercepts are examined. In this study, to
maximize interpretability, the predictor scores were scaled
such that all the intercepts occurred at the cut point (i.e.,
passing score). Specifically, the cut score was subtracted
from the predictor score.
Although fairness analysis is based on a separate
regression line for each of the two groups being compared, a quicker method uses a single regression analysis.
The significance tests in this analysis are equivalent to the
tests that would be done using two lines. In this analysis,
there is one dependent variable and three independent
variables. The dependent variable is the criterion. The
independent variables are shown below:

The predictor.
The group (a nominal dichotomous variable which
indicates whether the person is in the focal or reference
group). If this independent variable is significant, it
indicates that, if a separate regression were done for each
of the two groups, the intercepts of the regression lines
would be significantly different. Because the predictors
in this study were rescaled for these analyses such that the
intercepts occurred at the cut scores, a difference in
intercepts means that the two regression lines are at
different elevations at the cut score. That is, they have
different criterion scores at the predictor's cut score.
The predictor by group interaction term. This is the
product of group (i.e., 0 or 1) and the predictor score. If this
independent variable is significant, it indicates that, if a
separate regression were done for each of the two groups,
the slopes of the regression lines would be significantly
different. The standardized slopes equal the validities.

FAIRNESS
Analyses
The fairness analyses require analyses of job performance as well as test scores. As a consequence, all fairness analyses were performed on the concurrent validation


The regression equation is shown below:


$\text{criterion} = b_0 + b_{\text{predictor}} \cdot \text{predictor} + b_{\text{group}} \cdot \text{group} + b_{\text{interaction}} \cdot \text{interaction} + \text{error}$   [Equation 5.6.1]
The composite criterion and the composite predictor
were used for the fairness analyses. The composite criterion was the weighted sum of the composite rating and
the CBPM. Based on their relationships with the high
fidelity criterion measures, the ratings and CBPM were
assigned weights of .4 and .6, respectively. The ratings and
CBPM scores were standardized before they were added.
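A sketch of the single-regression fairness test in Equation 5.6.1, assuming arrays for the criterion, the cut-score-centered predictor, and a 0/1 group indicator; the data are illustrative only:

```python
import numpy as np

def fairness_regression(criterion, predictor, group):
    """Equation 5.6.1: regress the criterion on the predictor, the group
    indicator, and their interaction; returns the b weights
    [intercept, predictor, group, interaction]."""
    predictor = np.asarray(predictor, dtype=float)   # already centered at the cut score
    group = np.asarray(group, dtype=float)           # 1 = reference group, 0 = focal group
    X = np.column_stack([np.ones(len(predictor)), predictor, group, predictor * group])
    b, *_ = np.linalg.lstsq(X, np.asarray(criterion, dtype=float), rcond=None)
    return b

# Illustrative data with equal slopes and intercepts, i.e., a fair predictor.
rng = np.random.default_rng(7)
pred = rng.normal(0, 1, 800)
grp = rng.integers(0, 2, 800)
crit = 0.5 * pred + rng.normal(0, 1, 800)
print(np.round(fairness_regression(crit, pred, grp), 2))
```

A significant weight on the group term indicates different intercepts at the cut score; a significant weight on the interaction term indicates different slopes.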


rion scores at the cut score, scaled in standard deviation units about the regression line.17 A negative value indicates that the minority's regression line was below the reference group's line.
The table shows that the slopes of the regression lines
are very similar for almost all of the predictors. There are
no significant differences in either the slopes or intercepts that favor the whites or males, except for the EQ
Self-Awareness scale whose slope favors males. Therefore, the test battery is equally valid for all groups. In
addition, the intercepts for males and whites are above
the intercepts for females, blacks and Hispanics for
every predictor. Thus, there is no evidence of unfairness whatsoever.
The absence of significant differences between intercepts (at the cut score) in Table 5.6.6 shows that the
minority group's intercept (at the cut score) was never significantly above the reference group's intercept. In
fact, the reverse was often true. That is, for many
predictors, the performance of the minority group was
over-predicted by the predictor score. The degree of over-prediction was greatest for blacks and least for females.
Another way to examine fairness is to see if the group
differences are similar in the composite predictor and
composite criterion. Table 5.6.7 shows this analysis.
Although females, blacks, and Hispanics had lower
scores and passing rates on the composite predictor than
males and whites, these differences were virtually identical using the criterion scores. None of the discrepancies
were statistically significant.
Both the fairness analyses and the comparison of the
group differences on the predictor and criterion strongly
support the fairness of the final predictor battery score.
The slopes among the groups are very similar and the
differences in intercepts always favor the minority group.
The group differences in terms of passing rates and
differences in means are remarkably similar in the predictor compared to the criterion. The fairness analyses
provide strong evidence of fairness for the individual
tests as well.

RESULTS AND CONCLUSIONS


Examples of the fairness regression scatterplots are
shown in Figures 5.6.1, 5.6.2, and 5.6.3 below. The
regression lines for both groups (i.e., reference and
minority) are shown in each plot. The slopes of the two
regression lines are very similar in each of the three
graphs. Thus, the validities differ little between the
groups in each graph. The near-parallelism of the regression lines is reflected in the similar values of the two
groups standardized slopes listed in the graphs and in
Table 5.6.6. In terms of the intercepts, however, the
white and male regression lines are above the female,
Hispanic, and especially the black regression lines at the
cut score. Thus, the predictor composite over-predicts
performance for the three minority groups compared
with the reference groups, which means that the test
actually favors the minority groups. Under these circumstances, a regression equation based on the total sample
produces predicted job performance levels that are higher
than the actual performance levels observed for minorities. In a selection situation, minorities would be favored
in that they would achieve a higher ranking on a selection
list than would be indicated by actual performance.
Table 5.6.6 shows the results of the fairness regressions for all of the predictor scores. It displays the
standardized slopes for each regression line. These are
equivalent to validity coefficients. The table also shows
the Regression Lines Difference at Cut Score (in Std. Dev.
Units). This is the difference between the intercepts
divided by the reference groups standard error of estimate. Thus it can be considered to be the difference
between minority vs. reference groups predicted crite-

17
Linear regression assumes that the standard deviation of the criterion scores is the same at every predictor score. This is
called homoscedasdicity. In practice, this assumption is violated to varying degrees. Thus, in theory, the standard error of
estimate should equal the standard deviation of the criterion scores at the predictors cut scoreand at every other predictor
score as well. In practice, this is only an approximation.


The sample size of each of the groups is an important issue in fairness regressions. If the samples are too small,
the analyses will be unable to detect statistically significant evidence of unfairness. Figure 5.6.4 below shows
the 95% confidence intervals for the slope. The graph
clearly shows the wide confidence band for Hispanics;
the moderate bands for females and blacks; and the
narrow bands for males, whites, and the entire sample.
The slopes at the bottom of all confidence bands are well
above zero which shows that the validity is statistically
significant for each group.
The power analyses were done to consider the possibility that the analyses were not sensitive enough (i.e., the
sample size was too small) to have discovered evidence of
unfairness (see Table 5.6.8). From the fairness regressions, the reference groups were compared with the
minority groups in terms of their slopes and intercepts.
For each pair of slopes and intercepts, the analyses
determined how small the difference (i.e., a difference
favoring the reference groups) between the groups would
have to be in the population to achieve a power level of
80%. A power level of 80% means that, if we ran the
analysis for 100 different samples, we would find a
statistically significant difference between the two groups
(i.e., minority vs. reference group) in 80 of those samples.
The power analyses showed that even relatively small
differences between groups would have been detected in
our fairness analyses. Due to its smaller sample size, the
Hispanic group has the largest detectable differences.
Table 5.6.8 shows the sizes of the smallest detectable
differences at 80% power and p < .05.
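The logic of the power calculation can be approximated with the usual large-sample formula: at a two-tailed alpha of .05 and power of 80%, the smallest detectable difference is roughly 2.8 times the standard error of the difference being tested. The sketch below is a generic approximation, not the study's exact procedure.

    # Generic large-sample approximation of the smallest detectable difference
    # at 80% power and p < .05 (two-tailed); not the study's exact calculation.
    from scipy.stats import norm

    alpha, power = 0.05, 0.80
    z_alpha = norm.ppf(1 - alpha / 2)   # about 1.96
    z_power = norm.ppf(power)           # about 0.84

    def smallest_detectable_difference(se_of_difference):
        """Smallest slope or intercept difference detectable at the stated
        alpha and power, given the standard error of that difference."""
        return (z_alpha + z_power) * se_of_difference

    print(round(smallest_detectable_difference(0.10), 3))   # about 0.28 for SE = 0.10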

TARGETED RECRUITMENT

As indicated above, the AT-SAT Battery is equally valid and fair for white, African American and Hispanics
as well as male and female groups. It was also shown in
Chapter 5.5 that there is a strong positive relationship
between AT-SAT test scores and job performance as an
air traffic controller. At the same time, the FAA has the
responsibility to try to have the workforce demographics
reflect the population of the nation in spite of mean test
score differences between groups. We believe that the
solution to the apparently contradictory goals of hiring applicants with the highest potential for high job performance and maintaining an employee demographic profile that reflects the nation's population is to staff the ATCS positions with the use of targeted recruiting efforts. Simply stated, targeted recruiting is the process
of searching for applicants who have a higher than
average probability of doing well on the AT-SAT test
battery and, therefore, have the skills and abilities required for performance as an ATCS. For example, one
recruiting effort might focus on schools that attract
students with high math ability.
Figure 5.6.5 shows the distribution of AT-SAT scores
from the pseudo-applicant sample, including scores for
all sample members, females, Hispanics, and African
Americans. Two important observations can be made
from an examination of Figure 5.6.5. First, there are
obvious differences in mean test scores between the
various groups. Second, there is a high degree of
overlap in the test score distributions of the various
groups. This high degree of overlap means that there are
many individuals from each of the different groups who
score above the test cut score. These are the individuals
one would seek in a targeted recruiting effort. It should
be noted that the targeted recruiting effort needs to be a
proactive process of searching for qualified candidates. If
no proactive recruitment effort is made, the distribution
of applicants is likely to be similar to that observed in
Figure 5.6.5.
On the other hand, the potential impact of targeted
recruiting on mean test scores is shown in Table 5.6.9. In
the total applicant sample, 18.8% of the applicants
would likely pass at the 70 cut off. If applicants from the
top 10% of the black population were recruited so that
they were 6 times more likely to apply, about 15.5% of black applicants would be expected to pass at the 70 cut off. The change from 3.9% (the black pass rate with no targeted recruiting) to 15.5% (with
targeted recruiting) represents an increase of about 300%
in the black pass rate.
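The percentage increase quoted above is straightforward arithmetic, and the effect of targeted recruiting on a group's pass rate can be expressed as a simple weighted mixture. The sketch below checks the arithmetic and shows the generic mixture form; the pass rates passed to the function are hypothetical, not values from Table 5.6.9.

    # Check of the relative increase quoted above.
    base_rate, recruited_rate = 0.039, 0.155
    print((recruited_rate - base_rate) / base_rate)   # about 2.97, i.e., roughly 300%

    # Generic mixture form: expected group pass rate if the top decile of the
    # group's score distribution applies `weight` times as often as the rest.
    def expected_pass_rate(p_top, p_rest, top_share=0.10, weight=6.0):
        applicants_top = weight * top_share
        applicants_rest = 1.0 - top_share
        return (applicants_top * p_top + applicants_rest * p_rest) / (applicants_top + applicants_rest)

    print(expected_pass_rate(p_top=0.60, p_rest=0.02))   # hypothetical inputs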

DISCUSSION
Although many of the tests, including the final AT-SAT battery score, exhibited differences between
groups, there is no reliable evidence that the battery is
unfair. The fairness analyses show that the regression
slopes are very similar among the groups (white, male,
female, black, Hispanic). There are differences among
the intercepts (at the cut score), but these differences
favor the minority groups. Thus, there is strong evidence that the battery is fair for females, blacks, and
Hispanics. These results show that the test battery is
equally valid for all comparison groups. In addition,
differences in mean test scores are associated with
corresponding differences in job performance measures. For all groups, high test scores are associated
with high levels of job performance and low scores are
associated with lower levels of job performance.


CHAPTER 6

THE RELATIONSHIP OF FAA ARCHIVAL DATA TO AT-SAT PREDICTOR AND CRITERION MEASURES

Carol A. Manning and Michael C. Heil
Federal Aviation Administration, Civil Aeromedical Institute
The FAA Civil Aeromedical Institute (CAMI) has
conducted research in the area of air traffic controller
selection and training for nearly 3 decades. As a result of
this research, CAMI established several Air Traffic Control Specialist (ATCS) data bases that contain selection
and training scores, ratings, and measures as well as
demographic information and other indices of career
progression. The archival data described below were
matched with AT-SAT predictor test and criterion performance scores for controllers participating in the concurrent validation study who agreed to have their historical
data retrieved and linked with the experimental selection
and performance data.

PREVIOUS ATC SELECTION TESTS

The United States ATCS selection process between 1981 and 1992 consisted of two testing phases: (a) a 4-hour written aptitude examination administered by the United States Office of Personnel Management (OPM); and (b) a multi-week screening program administered by the FAA Academy. A description of these tests is presented below.

OPM Test Battery
The OPM test battery included the Multiplex Controller Aptitude Test, the Abstract Reasoning Test, and the Occupational Knowledge Test. The Multiplex Controller Aptitude Test (MCAT) required the applicant to combine visually presented information about the positions and direction of flight of several aircraft with tabular data about their altitude and speed. The applicant's task was to decide whether pairs of aircraft would conflict by examining the information to answer the questions. Other items required computing time-distance functions, interpreting information, and spatial orientation. Performance on the MCAT was reported as a single score. The Abstract Reasoning Test (ABSR) was a civil service examination (OPM-157) that included questions about logical relationships between either symbols or letters. This was the only test retained from the previous Civil Service Commission (CSC) battery in use before 1981. (The other CSC tests were Computations, Spatial Patterns, Following Oral Directions, and a test that slightly resembled the MCAT). The Occupational Knowledge Test was a job knowledge test that contained items related to air traffic control phraseology and procedures. The purpose of using the Occupational Knowledge Test was to provide candidates with extra credit for demonstrated job knowledge.

The MCAT comprised 80% of the initial qualifying score for the OPM battery, while the Abstract Reasoning Test comprised 20%. After these weights were applied to the raw scores for each test, the resulting score was transmuted to a distribution with a mean of 70 and a maximum score of 100. If the resulting Transmuted Composite score (TMC) was less than 70, the applicant was eliminated from further consideration. If, however, the applicant earned a TMC of 70 or above, he or she could receive up to 15 extra credit points (up to a maximum score of 100) based upon the score earned on the Occupational Knowledge Test (OKT). Up to 10 extra credit points (up to a maximum score of 110) could also be added based on Veterans Preference. The sum of the TMC and all earned extra credit points was the final OPM Rating.

This version of the OPM ATCS battery was implemented in September 1981, just after the Air Traffic Controller strike. For some time after the strike, applicants were selected using either a score on the earlier CSC battery or on the later OPM battery. Because of concerns about artificial increases in test scores as a function of training, changes were made in October 1985 to 1) replace the versions of the MCAT that were used, 2) change the procedures used to administer the MCAT, and 3) change eligibility requirements for re-testing.
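Restated as code, the scoring rules described above take the following form. This is a sketch under the stated assumptions; the raw-score transmutation table is not reproduced in this chapter, so the function starts from an already-transmuted composite.

    # Sketch of the final OPM Rating described above. The raw-score-to-TMC
    # transmutation is not given here, so the already-transmuted composite
    # (mean 70, maximum 100) is taken as input.
    def opm_rating(tmc, okt_points, veterans_points):
        if tmc < 70:
            return None                                   # eliminated from further consideration
        rating = min(tmc + min(okt_points, 15), 100)      # OKT extra credit, capped at 100
        rating = min(rating + min(veterans_points, 10), 110)   # Veterans Preference, capped at 110
        return rating

    print(opm_rating(85, 12, 5))   # 97 after OKT credit, then 102 with Veterans Preference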

Academy Nonradar Screening programs


Because tens of thousands of people applied for the
job of Air Traffic Control Specialist (ATCS), it was
necessary to use a paper-and-pencil format to administer
the CSC/OPM batteries. With paper-and-pencil testing, it was difficult to measure aptitudes that would be
utilized in a dynamic environment. Consequently, there

continued to be a high attrition rate in ATCS field training even for candidates who successfully completed the initial selection process (earning a qualifying score on the CSC/OPM selection battery, and passing both a medical examination and a background investigation).
In 1975, the Committee on Government Operations
authorized the FAA Academy to develop and administer
a second-stage selection procedure to provide early and
continued screening to insure prompt elimination of
unsuccessful trainees and relieve the regional facilities of
much of this burden.
In January of 1976, two programs were introduced at
the FAA Academy to evaluate students' ability to apply
a set of procedures in an appropriate manner for the nonradar control of air traffic. From 1976 until 1985,
candidates entered either the 12-week En Route Initial
Qualification Training program (designed for new hires
assigned to en route facilities) or the 16-week Terminal
Initial Qualification Training program (designed for
new hires assigned to terminal facilities). While both
programs were based on non-radar air traffic control,
they used different procedures and were applied in
different types of airspace. Academy entrants were assigned to one program or the other on a more-or-less
random basis (i.e., no information about their aptitude,
as measured by the CSC/OPM rating, was used to assign
them to an option or facility). Those who successfully
completed one of the programs went on to a facility in
the corresponding option. Those who did not successfully complete one of the programs were separated from
the GS-2152 job series.
Both the En Route and Terminal Screen programs
contained academic tests, laboratory problems, and a
Controller Skills Test. The laboratory problems, each
one-half hour in length, required the student to apply the
principles of non-radar air traffic control learned during
the academic portions of the course to situations in
which simulated aircraft moved through a synthetic
airspace. Student performance was evaluated by certified
air traffic control instructors. Two scores, a Technical
Assessment (based on observable errors made) and an
Instructor Assessment (based on the instructor's rating of the student's potential) were assigned by the grading
instructor for each problem. These assessment scores
were then averaged to yield an overall laboratory score for
a single problem.
The Controller Skills Test (CST) measured the application of air traffic control principles to resolve air traffic
situations in a speeded paper-and-pencil testing situation. The composite score in the program was based on a weighted sum of the Block Average (BA; the average of scores from the academic block tests), the Comprehensive Phase Test (CPT; a comprehensive test covering all
academic material), the Lab Average (the average score
on the best 5 of 6 graded laboratory problems), and the
Controller Skills Test (CST). A composite grade of 70
was required to pass. From 1976 until 1985, the same
weights were applied to the program components of both
the En Route and Terminal Screen programs to yield the
overall composite score: 2% for the Block Average, 8%
for the Comprehensive Phase Test, 65% for the Lab
Average, and 25% for the CST.
For those candidates entering the Academy after the
Air Traffic Controller strike of 1981, the pass rate in the
En Route Screen program was 52.3% and the pass rate
in the Terminal Screen program was 67.8%. The pass
rate in both programs combined was 58.0%. In October
of 1985, the two programs were combined to create the
Nonradar Screen program. The purpose of using a single
program was to allow facility assignments to be based,
when possible, upon the final grade earned in the program. The Nonradar Screen program was based upon the
En Route screen program (containing the same lessons
and comparable tests and laboratory problems). It was
necessary to change the weights applied to the individual
component scores of the Nonradar Screen program to
maintain the average pass rate obtained for both the En
Route and Terminal screen programs. The weights used
in the Nonradar Screen program to yield the overall
composite score were: 8% for the Block Average, 12%
for the Comprehensive Phase Test, 60% for the Lab
Average, and 20% for the CST. The pass rate for the
Nonradar Screen program was 56.6%.
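The two weighting schemes described above can be summarized in a short calculation. The component scores below are hypothetical; only the weights and the 70-point passing score come from the text.

    # Composite screen score under the weights described above.
    WEIGHTS = {
        "en_route_terminal_1976_1985": {"block_avg": 0.02, "cpt": 0.08, "lab_avg": 0.65, "cst": 0.25},
        "nonradar_1985_on":            {"block_avg": 0.08, "cpt": 0.12, "lab_avg": 0.60, "cst": 0.20},
    }

    def composite_score(scores, program="nonradar_1985_on", passing=70.0):
        weights = WEIGHTS[program]
        total = sum(weights[name] * scores[name] for name in weights)
        return total, total >= passing

    # Hypothetical component scores on a 0-100 scale.
    print(composite_score({"block_avg": 88, "cpt": 80, "lab_avg": 72, "cst": 75}))   # (74.84, True)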
The Pre-Training Screen
In 1992, the Nonradar Screen program was replaced
with the Pre-Training Screen (PTS) as the second-stage
selection procedure for air traffic controllers. The goals
of using the PTS were to 1) reduce the costs of ATCS
selection (by reducing the time required for screening
controllers from approximately 9 weeks to 5 days), 2)
maintain the validity of the ATCS selection system, and
3) support agency cultural diversity goals. The PTS
consisted of the following tests: Static Vector/Continuous Memory, Time Wall/Pattern Recognition, and Air
Traffic Scenarios Test. Broach & Brecht-Clark (1994)
conducted a predictive validity study using the final
score in the ATCS screen as the criterion measure. They
found that the PTS added 20% to the percentage of
variance explained in the Nonradar Screen Program final


score, over and above the contribution made by the OPM test. Broach & Brecht-Clark (1994) also described
a concurrent validation study conducted using 297
developmental and Full Performance Level (FPL) controllers. The criterion used for this study was a composite
of supervisor ratings and times to complete field training, along with performance in the Radar Training
program. The corrected multiple correlation between
PTS final score and the training composite score was .25
as compared with .19, which was the multiple correlation between the ATCS screen score and the training
composite.

VanDeventer (1983) found that the biographical question related to grades in high school mathematics
courses loaded .31 on a factor defined by pass/fail status
in the Academy screening program. Taylor, VanDeventer, Collins, & Boone (1983) found that, for a group of 1980 candidates, younger age, higher grades in high school math and biology, pre-FAA ATC experience, fewer repetitions of the CSC test, and a self-assessment of performance in the top 10% of all controllers were related to an increased probability of passing the Nonradar Screen program. Collins, Manning, & Taylor (1984) found that, for a group of trainees entering the Academy between 1981 and 1983, the following were positively related to pass/fail status in the Nonradar Screen program: higher grades in high school math, physical science, and biology classes, a higher overall high school grade point average, younger age, not being a member of the armed forces, taking the OPM test only one time, expectations of staying in ATC work more than 3 years, and a self-assessment that the trainee's performance would be in the top 10% of all ATCSs. Collins, Nye, & Manning (1990) found, for
a group of Academy entrants between October 1985 and
September 1987, that higher mathematics grades in high
school, higher overall high school grade point average,
self-assessment that less time will be required to be effective as an ATCS, self-assessment that the trainee's
performance level will be in the top 10% of all ATCSs,
and having taken the OPM test fewer times were related
to pass/fail status in the Academy screening program.

Radar Training (Phase XA)


A second screening program, the En Route Basic
Radar Training Course (otherwise known as RTF), was
administered to en route developmentals who had completed their Radar Associate/Nonradar on-the-job training. The RTF course was a pass/fail course, and
developmentals who did not pass were unable to proceed
in further radar training at their facilities unless they
recycled and later passed the course. However, the pass
rate in this phase of training exceeded 98%. The RTF
course paralleled the Nonradar Screen program, including an average grade on block tests (2% of the final
grade), a comprehensive phase test (8% of the final
grade), an average grade for laboratory evaluations (65%
of the final grade), and a Controller Skills Test (25% of
the final grade.)

OTHER ARCHIVAL DATA OBTAINED FOR ATC CANDIDATES

16PF and Experimental Tests


Also available were scores from the Sixteen Personality Factor (16PF), which is administered during the
medical examination and scored with a revised key
(Cattell & Eber, 1962; Convey, 1984; Schroeder &
Dollar, 1997). Other tests and assessments were administered during the first week of the Academy screening
programs; however, they were often administered to a
limited number of classes. Consequently, these tests
would have been taken by only a few of the controllers
who passed the Academy, became certified in an en route
facility, and eventually participated in the concurrent
validation study. Only the Mathematics Aptitude Test
was taken by a sufficient number of participants to
include in these analyses.

Biographical Questionnaire
Additional information about controller demographics and experience was obtained from data provided by
Academy entrants during the first week they attended
one of the Academy screening programs and obtained
from the Consolidated Personnel Management Information System (CPMIS). New entrants completed a
Biographical Questionnaire (BQ). Different BQ items
were used for those entering the Nonradar Screen Program at various times. The BQ questions concerned the
amount and type of classes taken, grades earned in high
school, amount and type of prior air traffic and/or aviation
experience, reason for applying for the job, expectations
about the job, and relaxation techniques used.


ARCHIVAL CRITERION MEASURES

Field Training Performance Measures as Criteria

Description of En Route ATCS Field Training
In the en route option, the unit of air traffic control operation is the sector, a piece of airspace for which a team of 2-3 controllers is responsible (during times of slow traffic, only one controller may be responsible for a sector). A group of between 5-8 sectors is combined into what is called an area of specialization. An en route controller is assigned to only one area of specialization, but is responsible for controlling traffic for all sectors within that area. The team of en route controllers working at a sector handles duties related to: Radar separation of aircraft (radar duties; including formulating clearances to ensure separation and delivering them by radio to pilots, handing off responsibility for an aircraft to another controller); assisting the radar controller (radar associate duties; including maintaining records about clearances that have been issued or other changes in the flight plan of an aircraft, identifying potential problems, communicating information not directly related to separation of aircraft to pilots or other controllers); or supporting other activities (assistant controller duties; including entering data into the computer, ensuring that all records of flight progress are available for the controller in charge).

En route controllers are usually trained as assistant controllers first, then given training on increasingly difficult responsibilities (radar associate duties, then radar). Training on concepts is conducted in the classroom, before being applied in a laboratory setting, and then reinforced during on-the-job training (OJT), which is conducted in a supervised setting. At some facilities, all radar associate training is completed before radar training begins. At other facilities, training is conducted by position: Both radar associate and radar training are provided for a specific position before training begins on the next position. At one point in time, en route controllers could have taken up to 9 phases of field training, depending on the way training was provided at the facility.

Measures of Performance in Field Training
Several measures of training performance were obtained for each phase of air traffic control field training. For each phase of training, the start and completion dates, the number of hours used to complete on-the-job training (OJT), the grade (Pass, Fail, or Withdraw), and a rating of controller potential, measured on a 6-point scale (provided by an instructor or supervisor who most frequently observed the student during that phase), were collected. This information was compiled to derive measures of training performance, such as the amount of time (in years) required to reach full performance level (FPL) status, mean instructor ratings of potential computed for OJT phases (called the Indication of Performance), the amount of time (in calendar days) required to complete OJT in certain training phases, and the total number of OJT hours required to complete those phases. Data were used from only phases IX and XII because those phases included the first two sectors on which nonradar/radar associate (Phase IX) and radar (Phase XII) training were provided.

These measures of training performance were collected because they were readily available for most trainees, but a number of outside factors besides aptitude and technical proficiency could have affected their value. Time required to reach FPL status could be affected by delays in training caused by a number of factors, including the need for management to use a trainee to control traffic on sectors on which he/she had already certified instead of allowing him/her to participate in OJT, the number of other students undergoing OJT in the same airspace at the same time (limiting an individual's access to OJT), or the number of trainees (affecting the availability of the training simulation laboratory). The number of OJT hours required to certify on a specific sector could be affected by the type of traffic the student controlled during training or the difficulty of the sector. The subjective rating of trainee potential could be affected by a number of rating biases familiar to psychologists, such as halo, leniency, etc. In spite of the measurement problems associated with these training performance measures, they were the best measures available for many years to describe performance in ATCS technical training programs.

HISTORICAL STUDIES OF VALIDITY OF ARCHIVAL MEASURES

Brokaw (1984) reviewed several studies examining the relationship between aptitude tests and performance in both air traffic control training and on the job. He described an early study (Taylor, 1952) that identified a set of 9 tests having zero-order correlations of .2 or above with supervisor job performance ratings or composite criteria. A selection battery that included the following tests was recommended but not implemented: Memory

for Flight Information, Air Traffic Problems I & II, Flight Location, Coding Flight Data I, Memory for
Aircraft Position, Circling Aircraft, Aircraft Position,
and Flight Paths.
A more extensive study was performed during a joint
Air Force Personnel Laboratory and Civil Aeronautics
Administration collaboration (Brokaw, 1957). Thirty-seven tests were administered to 130 trainees in an ATC
school. Criteria were based on performance in the ATC
course, including grades for the lecture, instructor ratings, and a composite of ratings from multiple instructors. Tests related to one or more of the training criteria
involved Computational and Abstract Reasoning (including Dial and Table Reading and Arithmetic Reasoning tests), Perceptual and Abstract Reasoning, Verbal
Tests, Perceptual Speed and Accuracy, and Temperament. The multiple correlation of four tests (Air Traffic
Problems, Arithmetic Reasoning, Symbol Reasoning
and Perceptual Speed, and Code Translation) with the
instructor rating was .51.
A follow-up study (Brokaw, 1959) was conducted to
examine the relationship between the experimental selection battery and supervisor ratings of on-the-job
performance. The multiple correlation of the same four
tests with the supervisor rating was .34. Trites (1961)
conducted a second follow-up study using Brokaw's
1957 sample, obtaining supervisor ratings after hire.
Symbolic Reasoning and Perceptual Speed, Abstract
Reasoning (DAT), Space Relations (DAT), and Spatial
Orientation (AFOQT), were all significantly related to
supervisor ratings provided in 1961 (correlations were
.21, .18, .18, and .23, respectively.) The correlations
were reduced somewhat when partial correlations were
computed holding age constant. Furthermore, the Family Relations Scale from the California Test Bureau
(CTB) California Test of Personality had a .21 correlation with the 1961 supervisor ratings. The correlation
was not reduced by partialing out the effect of age.
Trites & Cobb (1963), using another sample, found
that experience in ATC predicted performance both in
ATC training and on the job. However, aptitude tests
were better predictors of performance in training than
was experience. Five aptitude tests (DAT Space Relations, DAT Numerical Ability, DAT Abstract Reasoning, CTMM Analogies, and Air Traffic Problems) had
correlations of .34, .36, .45, .28, and .37 with academic
and laboratory grades, while the correlations with supervisor ratings were lower (.04, .09, .12, .13, and .15,
respectively) for en route controllers.

Other studies have examined relationships between experimental tests and performance in the FAA Academy
Screening Program. Cobb & Mathews (1972) developed
the Directional Headings Test (DHT) to measure speeded
perceptual-discrimination and coding skills. They found
that the DHT correlated .41 with a measure of training
performance for a group of air traffic control trainees
who had already been selected using the CSC selection
battery. However, the test was highly speeded, and was
consequently difficult to administer.
Boone (1979), in a study using 1828 ATC trainees,
found that the Dial Reading Test (DRT; developed at
Lackland AFB for selecting pilot trainees) and the DHT
had correlations of .27 and .23, respectively, with the
standardized laboratory score in the Academy screen
program. An experimental version of the MCAT correlated .28 with the lab score. In the same study, CSC 24
(Computations) and CSC 157 (Abstract Reasoning)
correlated .10 and .07, respectively, with the laboratory
score.
Schroeder, Dollar & Nye (1990) administered the
DHT and DRT to a group of 1126 ATC trainees after
the air traffic control strike of 1981. They found that the
DHT correlated .26 (.47 after adjustment for restriction
in range) with the final score in the Academy screening
program, while the DRT correlated .29 (.52 after adjustment for restriction in range) with the final score in the
Academy screening program. MCAT correlated .17 and
Abstract Reasoning correlated .16 with the final score,
though those two tests had been used to select the trainees.
Manning, Della Rocco, and Bryant (1989) found
statistically significant (though somewhat small) correlations between the OPM component scores and measures of training status, instructor ratings of trainee
potential, and time to reach FPL (a negative correlation)
for 1981-1985 graduates of the en route Academy screening program. Correlations (not corrected for restriction
in range) of the MCAT with training status, OJT hours
in Phase IX, mean Indication of Performance for Phases
VIII-X, OJT hours in Phase XII, Indication of Performance in Phases XI-XIII, and time to FPL were -.12, .05,
.11, .08, .11, and -.11, respectively. Correlations (not
corrected for restriction in range) of the Abstract Reasoning Test with the same measures of field training performance were .03, .04, .03, .09, .01, and -.02, respectively.
Manning et al. also examined correlations between
component scores in the en route Academy screening
program and the same measures of field training performance. Correlations (not corrected for restriction in


range) of the Lab Average with training status, OJT hours in Phase IX, Indication of Performance in Phases
VIII-X, OJT hours in Phase XII, Indication of Performance in Phase XII, and Time to FPL were -.24, -.06,
.23, -.12, .24, and -.16, respectively. Correlations (not
corrected for restriction in range) of the Nonradar Controller Skills Test with the same training performance
measures were -.08, -.02, .11, 0, .07, and -.09. Correlations (not corrected for restriction in range) of the
Final Score in the Screen with the same training
performance measures were -.24, -.06, .24, -.10, .24,
and -.18, respectively.
Manning (1991) examined the same relationships for
FY-96 graduates of the ATC screen program, assigned to
the en route option. Correlations (not corrected for
restriction in range) of the MCAT, Abstract Reasoning
Test, and OPM rating with status in field training were
.09, .03, and .09, respectively. When adjusted for restriction in range, these correlations were .24, .04, and .35,
respectively. Correlations (not corrected for restriction
in range) of the Lab Average, Controller Skills Test, and
Final Score in the Screen with status in field training were
.21, .16, and .24, respectively. When adjusted for restriction in range, these correlations were .36, .26, and .44,
respectively.

RELATIONSHIPS BETWEEN ARCHIVAL DATA AND AT-SAT MEASURES

Relationship of Archival and AT-SAT Criterion Measures
It is expected that the measures of field training performance used during the 1980s as criterion measures to assess the validity of the OPM test and Academy screening programs will also be significantly correlated with the AT-SAT criterion measures. The magnitude of these correlations might be lower than those computed among the original archival measures because several years have elapsed between the time when field training occurred and the administration of the AT-SAT criterion measures.

Table 6.1 shows correlations between the archival criterion measures and the AT-SAT criterion measures. These correlations have not been adjusted for restriction in the range of the training performance measures. Correlations between days and hours in the same phase of training were high, and correlations between days and hours in different phases of training were moderate. Correlations between the Indication of Performance and time in the same or different phases of training were non-significant, but the correlation between the Indication of Performance in Phase IX and the Indication of Performance in Phase XII was moderately high.

Correlations between time in training phases and the composite criterion rating were statistically significant at the .01 level, but were not very high. The CBPM was significantly correlated with only the days and hours in Phase XII, which described the outcome of training on the first two radar sectors. It makes sense that the CBPM would relate particularly to performance in radar training because the CBPM contains items based on radar concepts. Correlations of both the ratings and the CBPM with the Indication of Performance variables were either non-significant or not in the expected direction (i.e., correlations of AT-SAT criteria with the Indication of Performance variables should be positive while correlations with training times should be negative).

Relationship of Archival Predictors with Archival and AT-SAT Criterion Measures
Because the archival and AT-SAT criterion measures are related, and because the ATCS job has changed little in the last 15 years, the selection procedures previously used by the FAA and the AT-SAT criterion measures should be correlated. The following two tables show relationships of the OPM rating and performance in the Academy screen program with both the archival and AT-SAT criterion measures. It should be remembered that the controllers who participated in the concurrent validation study were doubly screened: first on the basis of their OPM rating, then, second, on the basis of their score in the Academy Screen program. Current FPLs were also reduced in number because some failed to complete training successfully. Thus, there is considerable restriction in the range of the selection test scores.

Table 6.2 shows correlations of the archival selection test scores (OPM Rating, final score in the Nonradar Screen program, and final score in the Radar Training program) with both the archival criterion measures and the AT-SAT criterion measures. Correlations adjusted for restriction in the range of the predictors are in parentheses after the restricted correlations. The OPM rating correlated .18 with the final score in the Nonradar Screen program and .11 with the final score in the Radar Training program. The OPM rating had very low correlations with archival criterion measures (although it was significantly correlated with the Indication of Performance in initial radar training). The OPM rating was not significantly correlated with the rating composite, but was significantly correlated with the CBPM score


(r = .22). The final score in the Nonradar Screen program was significantly correlated with training times in both phases of field training and with time to reach FPL status, but not with either Indication of Performance measure. The final score in the Nonradar Screen program was also significantly correlated with both AT-SAT criterion measures, although the correlation with the CBPM (.34) was much higher than the correlation with the rating composite (.12). The final score in the Radar Training program was also significantly correlated with training times, and was significantly correlated with the Indication of Performance for initial radar training. It was also significantly correlated with both the AT-SAT rating composite (.17) and the CBPM score (.21).
Table 6.3 shows correlations of the performance-based components of the archival selection procedures
(Nonradar Screen program and Radar Training program) with both the archival and AT-SAT criterion
measures. The correlations at the top of the table are
intercorrelations between archival selection procedure
components. Of the OPM component scores, only the
Abstract Reasoning Test and the MCAT were significantly correlated.
Correlations of components of the OPM battery with
component scores from the Nonradar Screen program
and the Radar Training program were fairly low, although some statistically significant correlations with
scores from the laboratory phases were observed. The
MCAT was significantly correlated with Instructor Assessment and Technical Assessment from both the
Nonradar Screen and Radar Training programs, and was
significantly correlated with the Nonradar CST. Abstract Reasoning was significantly correlated with only
the nonradar Average Technical Assessment and the
nonradar CST. The OKT had a small but statistically
significant correlation with the Nonradar Average Instructor Assessment.
The correlation between the Average Instructor Assessment and Average Technical Assessment from each
course was very high (.79 and .83, for the Nonradar
Screen program and Radar Training program, respectively). Across programs, the Average Instructor Assessment and Average Technical Assessment had significant
correlations that ranged between about .02 and .35. The
Controller Skills Tests for both courses had significant
correlations with the Nonradar Average Technical and
Average Instructor Assessment. While the Nonradar
CST was significantly correlated with the Radar Average
Instructor and Technical Assessments, the Radar CST

was not. Correlation between CSTs was only .25, which was similar to correlations with other components of the
Nonradar Screen and Radar Training programs.
Correlations of OPM component scores with the
rating criterion measure were all low and non-significant. However, the MCAT and Occupational Knowledge Tests were both significantly correlated with the
CBPM score.
Of the components of the Nonradar Screen and
Radar Training programs, the Average Technical Assessment had significant negative correlations with training
times (though not with the Indication of Performance
measures). The Radar Technical Assessment was correlated both with time spent in Radar Associate and Radar
field training phases, while the Nonradar Technical
Assessment was only correlated with time spent in Radar
field training phases. Both were significantly correlated
with the Time required to reach FPL status. The Radar
Average Instructor Assessment was significantly correlated with time spent in Radar Associate field training.
Interestingly, the Nonradar Average Instructor Assessment was not related to time in phases of field training,
although its correlation with the Nonradar Average
Technical Assessment was about .8. Both the Nonradar
and Radar Average Instructor Assessment were significantly correlated with time to reach FPL status.
The Nonradar and Radar Average Technical Assessments and Average Instructor Assessments were all significantly related to the CBPM score, though only the
Nonradar Average Instructor Assessment was significantly related to the rating composite. Both the Nonradar
and Radar Controller Skills Tests were significantly
correlated with the CBPM. This relationship is not
surprising because the CSTs and CBPM have similar
formats: They all present a sample air traffic situation
and ask the respondent to answer a multiple choice
question (under time pressure) involving the application
of ATC procedures. The CSTs were presented in a
paper-and-pencil format while the CBPM was presented
using a dynamic computer display.
Relationship of Archival Criteria and High-Fidelity
Simulation Criteria
Table 6.4 shows correlations of the criterion measures
obtained from the high-fidelity simulation (comprising
107 participants) with archival performance-based predictor and archival criterion measures. The high-fidelity
criterion measures used in this analysis included the
individual scales used in the Over-the-Shoulder rating


form. Also used was the number of operational errors made during the 7th graded scenario, the most complex scenario included in the simulation test. The high-fidelity rating scales were correlated very highly with
each other (.80 and above). The number of operational
errors made in the 7th graded scenario was correlated -.20
to -.32 with the high fidelity rating scales, which were
based on performance in all 7 graded scenarios. The
high-fidelity rating scales (based on assessments of maximum performance) had correlations of about .35 to
about .40 with the AT-SAT rating composite (based on
assessments of typical performance), and had correlations of about .60 to .65 with the CBPM. The number
of operational errors made in the 7th graded scenario was
not significantly correlated with either the AT-SAT
rating composite or the CBPM.
The high-fidelity rating scales were not correlated
with either Indication of Performance measure obtained
from field training records. OJT hours in Phase IX
(Radar Associate/Nonradar training) had significant
negative correlations with several individual high-fidelity rating scales, including the overall rating. OJT hours
in Phase XII (field Radar training) had significant negative correlations with all high-fidelity rating scales except Coordination. Time to reach FPL status had
significant negative correlations with only Maintaining
efficient air traffic flow and with Attention & Situation
Awareness.
The high-fidelity rating scales had higher, significant,
correlations with some of the performance-based components of the archival selection procedures. The high-fidelity rating scales were correlated between about .35
and .40 with the Average Instructor Assessment from the
Nonradar Screen program, and were correlated between
about .5 and .55 with the Average Technical Assessment
from the Nonradar Screen program. There were only
two significant correlations between the Controller Skills
Test from the Nonradar Screen program and the high-fidelity rating scales (Coordination and Managing Sector Workload). The high-fidelity rating scales had almost
no correlation with the Average Instructor Assessment
from the Radar screen program but were correlated
between about .55 and .60 with the Average Technical
Assessment from the Radar screen program. Performance on the Controller Skills Test from the Radar
screen program was correlated between about .60 and
.71 with the high-fidelity rating scales. Though many of
these correlations are statistically significant, they were
typically based on fewer than 60 participants who allowed their archival data to be matched with their

performance in the AT-SAT testing and the high-fidelity simulation testing. At the same time, it is interesting to
observe correlations of the magnitude seen here between
measures of performance from simulations that occurred
recently and from performance-based selection procedures that occurred between 5 and 15 years previously.
Relationship of Archival Measures and AT-SAT
Predictors
It was also expected that archival measures, including
archival selection tests and scores on experimental tests
administered at the FAA Academy during the first week
of the Academy screen program might have high correlations with AT-SAT predictor tests. High correlations
between AT-SAT predictors and other aptitude tests
should provide evidence supporting interpretations of
the construct validity of the AT-SAT tests. The magnitude of these correlations might be reduced, however,
because the experimental tests were administered between 5 and 15 years prior to the concurrent validity
study and the OPM test was probably administered
between 6 and 16 years previously.
An analysis was conducted to compute correlations
between scores on the OPM selection tests: the Multiplex Controller Aptitude Test (MCAT), the Abstract
Reasoning Test, and the Occupational Knowledge Test
(OKT), and the AT-SAT predictor tests. The MCAT,
the highest weighted component of the OPM rating,
required integrating air traffic information to make
decisions about relationships between aircraft. Thus,
aptitudes required to perform well on the MCAT might
be related to aptitudes required to perform well on the
Air Traffic Scenarios Test (ATST). Furthermore, the
skills required to integrate information when taking the
MCAT might be related to performance on the Letter
Factories Test, Time Wall, Scan, and Planes tests. Positive correlations of the AT-SAT predictors with the
MCAT, a test previously used to select controllers,
would provide further evidence of the validity of the tests
included in the AT-SAT battery.
Table 6.5 shows correlations of the MCAT, Abstract
Reasoning Test, and OKT with the AT-SAT predictor
tests. The computed correlations are followed in parentheses by correlations adjusted for restriction in the range
of each archival selection test. (Correlations for the OKT
were not adjusted for restriction in range because the
standard deviation of the OKT after candidates were
selected was larger than was its standard deviation before
applicants were selected.)


MCAT had significant, but small, correlations with many of the AT-SAT predictor tests: all measures derived from the Letter Factories test, Applied Math, Time
Wall Time Estimation Accuracy and Perceptual Accuracy scores (but not Perceptual Speed), Air Traffic Scenarios Efficiency and Safety scores (but not Procedural
Accuracy), Analogies Reasoning score (but not Latency
or Information Processing), Dials, Scan, both Memory
tests, Digit Span, Planes Timesharing score (but not
Projection or Dynamic Visual/Spatial), and Angles.
Abstract Reasoning was also significantly correlated
with several of the AT-SAT predictor tests. The relationship of most interest is with the component scores of
the Analogies test. Abstract Reasoning might be expected
to have a high correlation with Analogies because many
items in both tests are similar. Thus, it is not surprising
to observe a correlation of .33 between Abstract Reasoning and the Analogies: Reasoning score. However, the
correlation of Abstract Reasoning with the Latency and
Information Processing components was non-significant. Abstract Reasoning also correlated with other AT-SAT predictor tests: all Letter Factories subscores, Angles,
Applied Math, Time Wall: Time Estimation Accuracy
and Perceptual Accuracy (but not Perceptual Speed),
both Memory tests, Dials, Scan, and AT Scenarios:
Efficiency and Safety (but not Procedural Accuracy).
The Occupational Knowledge Test measured the
knowledge about aviation and air traffic control that
applicants brought to the job. The OKT had several
significant correlations with AT-SAT predictor tests,
although all but one was negative, implying that controllers who entered the occupation with less knowledge of
ATC performed better on the AT-SAT aptitude tests.
OKT was negatively correlated with Letter Factories
Situational Awareness and Planning & Thinking ahead
scores (but was not significantly correlated with number
of letters correctly placed), both memory tests, Time
Wall Perceptual Accuracy score, and Planes Dynamic
Visual/Spatial score. OKT had a significant positive
correlation with AT Scenarios Procedural Accuracy score.
Although many of these correlations are statistically
significant, they are nevertheless small, which might
appear to suggest that they do not provide evidence of the
construct validity of the AT-SAT predictor tests. Moreover, most of the correlations continued to be rather
small after they were adjusted for restriction in the range
of the archival selection tests. However, it must be
remembered that the participants in the concurrent
validity study were doubly (and even triply) selected,
because they first qualified on the basis of their perfor-

mance on the OPM test, then by passing the Nonradar Screen program (which had approximately a 40% loss
rate), then again by passing field training (which had
approximately an additional 10% loss rate). Thus, even
making one adjustment for restriction in range does not
compensate for all the range restriction that occurred.
Furthermore, performance on the AT-SAT predictor
tests may have been influenced by age-related effects.
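The range-restriction adjustments referred to throughout this chapter are conventionally made with the standard univariate (Thorndike Case 2) correction shown below. This is a generic textbook sketch for orientation, not necessarily the exact adjustment procedure used in the study.

    # Standard univariate correction for restriction in range (Thorndike Case 2).
    import math

    def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
        k = sd_unrestricted / sd_restricted
        return (r_restricted * k) / math.sqrt(1 - r_restricted**2 + (r_restricted**2) * (k**2))

    # Example: an observed r of .17 roughly doubles when the applicant standard
    # deviation is assumed to be twice that of the selected controllers.
    print(round(correct_for_range_restriction(0.17, 2.0, 1.0), 2))   # 0.33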
Archival Experimental Tests and AT-SAT Predictors. The next analysis examined the relationship of the
Dial Reading Test (DRT), the Directional Headings
Test (DHT), and two other archival measures of mathematical aptitude with AT-SAT predictor tests. The Dial
Reading Test is a paper-and-pencil version of the computerized AT-SAT Dials test, and so it would be expected that scores would be highly correlated. The DHT
was an experimental test administered to ATC trainees
during the 1970s. The DHT required comparing three
pieces of information: A letter (N, S, E, or W), a symbol
(^, v, <, or >), and a number (0 to 360 degrees), all
indicating direction, in order to determine whether they
indicated a consistent or inconsistent directional heading. A second part of the test required determining the
opposite of the indicated direction. Thus, performance
on the DHT might be expected to correlate positively
with both Angles and Applied Math.
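To make the item format concrete, a small checker in the spirit of a DHT item is sketched below. The symbol meanings, the mapping of headings to cardinal directions, and the scoring are assumptions for illustration, not a reproduction of the actual test.

    # Illustrative consistency check in the spirit of a DHT item: a letter,
    # a symbol, and a heading in degrees either agree or disagree.
    # Symbol meanings and the degree bands are assumptions, not the real test.
    LETTER_TO_DIRECTION = {"N": "N", "S": "S", "E": "E", "W": "W"}
    SYMBOL_TO_DIRECTION = {"^": "N", "v": "S", ">": "E", "<": "W"}

    def degrees_to_direction(degrees):
        # Map a compass heading (0 = north, 90 = east) to the nearest cardinal point.
        degrees = degrees % 360
        if degrees >= 315 or degrees < 45:
            return "N"
        if degrees < 135:
            return "E"
        if degrees < 225:
            return "S"
        return "W"

    def consistent(letter, symbol, degrees):
        indicated = {LETTER_TO_DIRECTION[letter],
                     SYMBOL_TO_DIRECTION[symbol],
                     degrees_to_direction(degrees)}
        return len(indicated) == 1

    print(consistent("N", "^", 350))   # True: all three indicate north
    print(consistent("E", "v", 90))    # False: the symbol indicates south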
The Math Aptitude Test was taken from the Educational Testing Service (ETS) Factor Reference Battery
(Ekstrom, French, Harman, & Derman, 1976). An item
dealing with reported grades in high school math courses
was also included in the analysis because this biographical information was previously found to be related to
success in the Nonradar Screen program.
Although these tests were administered between 5 and
15 years before the concurrent validation study, it is
expected that the DHT and DRT would be at least
moderately related to performance on some of the ATSAT predictor tests, especially those related to mathematical skills. It may be remembered that in past
research, the DHT and DRT had moderate correlations
with criterion measures of performance in ATC training.
Thus, positive correlations of the AT-SAT predictors
with the DHT and DRT would provide further evidence
of the validity of the AT-SAT tests.
Table 6.6 shows the relationship of three AT-SAT
predictor tests with DHT, DRT, the Math Aptitude
Test, and a biographical item dealing with high school
math grades. Numbers of respondents are shown in
parentheses after the correlation coefficient. As expected,
Applied Math had a high, positive correlation with the


Math Aptitude Test total score (.63). Applied Math also had statistically significant and reasonably high positive
correlations with Dial Reading Number Correct (.52)
and Directional Headings Number Correct Part 2 (.40).
Applied Math also had moderate, significant negative
correlations with Dial Reading Number items wrong (.36) and the biographical item dealing with high school
math grades (-.34).
Angles was significantly correlated with Dial Reading
Number Correct (.37) and Dial Reading Number Wrong
(-.28). Angles was also significantly correlated with the
Math Aptitude Test (.41) and the biographical item
dealing with high school math grades (-.21). Unexpectedly, Angles had a small positive (but significant) correlation with Directional Headings number wrong Part 2 (.18).
The results of the comparison of the Dials test and the
archival experimental tests were somewhat surprising.
Dials had a significant positive correlation with Dial
Reading number correct (.22) and a significant negative
correlation with Dial Reading number wrong (-.39).
However, the correlation with Dial Reading number
correct was low, considering that Applied Math and
Angles had higher correlations than did Dials. However,
Dials did not contain all the same items as Dial Reading.
After the Alpha testing, certain items present in Dial
Reading were removed from Dials, and other items were
inserted. Moreover, Dial Reading was presented in a
paper-and-pencil format while Dials was presented in a
computerized format. One might speculate that the
different formats were responsible for the reduced correlation. However, it must be remembered that Dial
Reading Test was administered between 5 and 15 years
prior to the administration of Dials, and considerable
training and aging occurred during the interim. While
air traffic controllers in the en route environment may
not read dials, they are trained extensively on other tasks
involving perceptual speed and accuracy, which is an
aptitude that the Dials test is likely to measure. Thus, it
is more likely that the low correlation between Dial
Reading and Dials resulted from changes in the items,
and the effects of time and aging on the individuals
taking the test, rather than a change in the format of the test.
Pre-Training Screen and AT-SAT Predictors. In
1991, a group of 297 developmental and FPL controllers
participated in a study assessing the validity of the Pre-Training Screen (Broach & Brecht-Clark, 1994). Sixty-one controllers who participated in the concurrent
validation of the PTS also participated in the AT-SAT
concurrent validation in 1997/1998.

Scoring algorithms used for the PTS version of the ATST differed from those used for the AT-SAT version
of the ATST. In the PTS version, the Safety score was a
count of safety-related errors and Delay Time measured
the amount of time aircraft were delayed. For both the
Safety score and Total Delay Time, higher scores indicated worse performance. In the AT-SAT version, the
Safety and Efficiency scores were based on counts of
errors and measurement of delays, but both variables
were transformed so that higher scores indicated better
performance. Procedural Accuracy is a new variable
based on the occurrence of errors not related to safety. It
is expected that the PTS Safety Score would be more
highly correlated with the AT-SAT Safety score than
with the AT-SAT Efficiency Score and that PTS Total
Delay Time would be more highly correlated with the
AT-SAT Efficiency Score than with the AT-SAT Safety
Score. It is also expected that the two PTS scores would
have significant negative correlations with the three AT-SAT scores.
Table 6.7 shows the relationship of the scores from
the version of the Air Traffic Scenarios Test included in
the Pre-Training Screen with the version of the Air
Traffic Scenarios Test included in AT-SAT. As expected, the PTS Safety Score is more highly correlated
with the AT-SAT Safety Score than with the AT-SAT
Efficiency Score (and those correlations are negative).
Also, the correlation between the PTS Average Total
Delay Time and AT-SAT Efficiency Score was both
significant and negative. The Procedural Accuracy score
from the AT-SAT version had little relationship with
either PTS ATST score.
Archival Data and the Experience Questionnaire.
The merging of the archival data with the AT-SAT
concurrent validation data provided an opportunity to
investigate the construct validity of the personality test
contained in the AT-SAT battery. Construct validity of
the Experience Questionnaire (EQ) was investigated
using the following methods: principal components analysis to determine the structure of the scale, and Pearson
product-moment correlations to determine the degree of
convergence and divergence with archival 16PF data.
The 167 items contained in the EQ were used to
calculate 14 personality scales, which were used in the
analyses.
In terms of the principal components analysis, a final solution revealing at least two independent factors would provide evidence that the EQ scales measure unique constructs. Relationships between some of the EQ scales would be anticipated; therefore, certain scales should load on the same factor. However, some scales should be unrelated, meaning that they should load on different factors. For example, taking charge and decisiveness are likely to be related and therefore load together on a factor. The variable concentration, on the other hand, would not be expected to have a high degree of relationship with these other two variables and should load on a different factor. An oblique principal components analysis was conducted using data collected during the AT-SAT concurrent validation study. As shown in Table 6.8, the principal components analysis resulted in the extraction of only two factors. The first factor accounts for 56% of the variance, whereas the second factor accounts for only 9.49%. Additionally, these two factors are correlated with each other (r=.54). These results suggest that the variance in EQ scores is best explained by one primary factor, although a small percentage is explained by a related factor. For the most part, the EQ scales are related to each other even when they should theoretically be distinct. The results of this principal components analysis fail to provide support for the independence of the EQ scale scores.
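
To make the extraction step concrete, the short Python sketch below computes the eigenvalues of the correlation matrix of a set of scale scores and the proportion of variance each component explains. It is an illustration only, not the study's analysis code: the data file and column layout are hypothetical, and the oblique rotation used in the study is not reproduced here.

# Illustrative sketch (not the study's code): principal components extraction
# for a set of personality scale scores. File and column names are hypothetical;
# the oblique rotation reported in the text is not reproduced.
import numpy as np
import pandas as pd

# Hypothetical data: one row per controller, one column per EQ scale score.
eq = pd.read_csv("eq_scales.csv")          # e.g., 14 columns of scale scores
z = (eq - eq.mean()) / eq.std(ddof=1)      # standardize each scale

# Principal components are the eigenvectors of the correlation matrix.
corr = np.corrcoef(z.values, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)    # eigh returns ascending eigenvalues
eigvals = eigvals[np.argsort(eigvals)[::-1]]

var_explained = eigvals / eigvals.sum()
print("Variance explained by each component:")
for i, v in enumerate(var_explained, start=1):
    print(f"  Component {i}: {v:.1%}")

# A common extraction rule: retain components with eigenvalues greater than 1.
print("Components retained (eigenvalue > 1):", int((eigvals > 1).sum()))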
Further support for the construct validity of the EQ
was sought by comparing scale scores to archival 16PF
scores. Although the 16PF is not necessarily the standard
by which all personality tests are measured, it is, in fact,
an established measure of personality traits that is widely
used in clinical and experimental settings. The merging
of these two databases resulted in 451 usable cases. A
description of the 16PF scales included in the analyses is
provided in Table 6.9. Certain relationships would be
expected to exist between scores from the two tests.
Specifically, there would be support for the construct
validity of the EQ scales if they correlate with 16PF scales
that measure a similar construct. Conversely, the EQ
scales would be expected to be unrelated to 16PF scales
that measure other constructs. Since the 16PF was
administered several years before the EQ, these expected
relationships are based on the assumption that measurement of personality characteristics remains relatively
stable over time. This assumption is supported by Hogan
(1996) and Costa & McCrae (1988). A summary of the
expected relationships between EQ and 16PF scale scores
is provided below.
The EQ Composure scales should be positively correlated with 16PF Factor C (emotionally stable), which
would indicate that people high in composure are also
more emotionally stable and calm. EQ Task Closure and
EQ Consistency of Work Behavior should be positively correlated with 16PF Factor G (conscientiousness). EQ Working Cooperatively should be positively correlated
with 16PF Factors A (outgoing and participating) and
Q3 (socially precise) as well as negatively correlated with
Factor L and Factor N (which would indicate that these
people are trusting and genuine). Furthermore, it would
be expected that a high score on EQ Decisiveness and
EQ Execution would be negatively correlated with 16PF
Factor O, meaning that decisive people would also be
expected to be self-assured and secure. EQ Flexibility
should have a positive correlation with 16PF Factor A
and a negative correlation with Factor Q4 (relaxed).
The EQ Tolerance for High Intensity scale would be
expected to be positively correlated with 16PF Factor H
(Adventurous) and negatively correlated with Factor O
(Apprehensive). EQ Self-Awareness and EQ Self-Confidence should both be negatively correlated with 16PF
Factor O (Apprehensive). A positive correlation between
EQ Self-Confidence and 16PF Factor Q2 (Self-sufficient) might also be expected. EQ Sustained Attention
and EQ Concentration should be related to 16PF Factor
G (conscientiousness) whereas EQ Taking Charge should
be related to 16PF Factor H (Adventurous) and Factor
E (Assertive). Finally, EQ Interpersonal Tolerance should
be positively correlated with 16PF Factor I (Tenderminded), Factor Q3 (socially precise), and Factor C
(Emotionally Stable).
Scores on the EQ and 16PF scales were compared using Pearson product-moment correlations, the results of which are presented in Table 6.10. The results of the correlational analyses between the EQ scales show that they are all inter-related. However, this is not surprising considering the results of the principal components analysis described above. Although relationships between some of the scales contained in a personality measure are not unusual, moderate to high correlations between all of the scales are another matter.
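
The convergence check described here amounts to computing a Pearson correlation for each hypothesized EQ-16PF pairing. A minimal Python sketch of that computation follows; the merged data file and the scale/factor column names are hypothetical stand-ins, and only a few of the pairings summarized above are shown.

# Illustrative sketch: Pearson correlations between selected EQ scales and the
# 16PF factors they are expected to converge with. Column names are hypothetical.
import pandas as pd
from scipy.stats import pearsonr

merged = pd.read_csv("eq_16pf_merged.csv")   # hypothetical merged archival file

expected_pairs = [
    ("EQ_Composure", "16PF_C"),        # emotional stability (expected positive)
    ("EQ_TaskClosure", "16PF_G"),      # conscientiousness (expected positive)
    ("EQ_SelfConfidence", "16PF_O"),   # apprehension (expected negative)
    ("EQ_TakingCharge", "16PF_E"),     # assertiveness (expected positive)
]

for eq_scale, pf_factor in expected_pairs:
    r, p = pearsonr(merged[eq_scale], merged[pf_factor])
    print(f"{eq_scale} vs {pf_factor}: r = {r:.2f}, p = {p:.3f}")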
As stated earlier, the EQ scores were compared to
16PF Factor scores in an effort to support construct
validity by determining whether or not these scales
measure what they are purported to measure. Although
statistically significant, the correlations between EQ and
16PF scales represent small effect sizes and are not of the
magnitude desired when attempting to support the
validity of a test. The statistical significance of these
relationships is most likely an artifact of sample size. For
the most part, the pattern of relationships with 16PF
scales was the same for all EQ scales. This would not be
expected if the EQ scales did in fact measure different
constructs. This pattern is not unexpected given the EQ

59

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 236 of 342 Page ID
#:431

inter-scale correlations and the results of the principal


components analysis. The results of these analyses fail to
provide evidence that the EQ scales measure unique
constructs, let alone the specific constructs they are
professed to measure. However, there are indications
that the EQ contributes to the prediction of AT-SAT
criterion measures (Houston & Schneider, 1997). Consequently, CAMI will continue to investigate the construct validity of the EQ by comparing it to other
personality measures such as the NEO PI-R.
Regression of Archival Selection Procedures and
AT-SAT Tests on AT-SAT Criteria. A multiple linear
regression analysis was conducted to assess the contribution of the AT-SAT tests in predicting the AT-SAT
criterion, over and above the contribution of the OPM
rating and final score from the Nonradar Screen program. The regression analysis used OPM rating, final
score in the Nonradar Screen program, and AT-SAT test
scores as predictors, and the weighted composite of AT-SAT criterion measures as the criterion variable. To
compute the weighted composite criterion measure, the
CBPM received a .6 weighting while the AT-SAT rating
composite received a .4 weighting. A stepwise regression
was used.
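
For readers who want the computation spelled out, the following Python sketch builds a weighted composite criterion with the .6/.4 weighting described above and then performs a simple forward stepwise selection. It is a hedged illustration, not the study's SPSS procedure: the variable names are hypothetical, and the entry rule (a minimum gain in R-squared) is a simplification of the stepwise criteria actually used.

# Illustrative sketch: weighted composite criterion (.6 CBPM + .4 rating
# composite) followed by a simple forward stepwise regression. Variable names
# are hypothetical; the entry criterion is a simplification.
import numpy as np
import pandas as pd

df = pd.read_csv("archival_and_atsat.csv")   # hypothetical merged data set

def zscore(s):
    return (s - s.mean()) / s.std(ddof=1)

# Standardize the two criterion components before weighting them.
df["criterion"] = 0.6 * zscore(df["cbpm"]) + 0.4 * zscore(df["rating_composite"])

predictors = ["opm_rating", "nonradar_final", "analogies", "applied_math",
              "scan_total", "eq_unlikely_virtues", "atst_proc_accuracy"]

def r_squared(X, y):
    X1 = np.column_stack([np.ones(len(X)), X])       # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

selected, remaining = [], list(predictors)
y = df["criterion"].values
while remaining:
    gains = {p: r_squared(df[selected + [p]].values, y) for p in remaining}
    best = max(gains, key=gains.get)
    current = r_squared(df[selected].values, y) if selected else 0.0
    if gains[best] - current < 0.005:                 # stop when the gain is negligible
        break
    selected.append(best)
    remaining.remove(best)

final_r2 = r_squared(df[selected].values, y)
print("Selected predictors:", selected)
print(f"Multiple R = {np.sqrt(final_r2):.3f}, R-squared = {final_r2:.3f}")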
Table 6.11 shows the results of the analysis. A model was identified that contained the following variables: Analogies Reasoning score, final score from the Nonradar Screen program, Applied Math Number Correct, Scan Total score, EQ Unlikely Virtues scale, and Air Traffic Scenarios Procedural Accuracy score. This model produced a multiple correlation coefficient of .465, accounting for about 22% of the variance in the AT-SAT composite criterion
variable. It is interesting that the final score in the
Nonradar Screen program contributed so much to the
prediction of the criterion measure, because there was
considerable restriction in the range of that variable. At
least 40% of the candidates failed the Nonradar Screen
program and were removed from employment, and
another 10% failed field training and were also removed
from employment or reassigned to another type of air
traffic facility.
It may appear surprising that more of the AT-SAT
predictor tests were not included in this model, but they
probably accounted for similar parts of the variance in
the AT-SAT composite criterion measure that were also
accounted for by the final score in the Nonradar Screen
program. For example, the Safety and Efficiency scores
from the Air Traffic Scenarios Test; Applied Math; Angles; the Letter Factories Number of Letters Correctly Placed, Planning & Thinking Ahead, and Situation Awareness scores; and the EQ Composure and Self-Confidence scales all had
significant correlations with the final score in the
Nonradar Screen program. On the other hand, the
Unlikely Virtues scale from the EQ probably tapped a
part of the variance in the AT-SAT composite criterion
measure that was not already tapped by another AT-SAT
predictor test or by the final score in the Nonradar Screen
program. The Unlikely Virtues scale will not be included
as part of the selection battery, but will be retained to
provide information about whether the applicant is
faking responses on the rest of the EQ scales.
Discussion
Several analyses were conducted to examine interrelationships between archival selection tests, archival criterion measures, and experimental tests administered to
candidates entering the Academy for the Nonradar Screen
program. The purpose of these analyses was to assess the
construct validity of the AT-SAT criterion measures and
predictors. The results of the analyses supported the
interpretation of the AT-SAT measures discussed in
other chapters of this report.
For example, the amounts of time required to complete various phases of field training, which were used as archival criterion measures, were related to the AT-SAT rating composite. Also, the OPM rating, the final score in the Nonradar Screen program, and the final score in the Radar Training program were all significantly correlated with the CBPM. The final score in the Nonradar Screen program and the final score in the Radar Training program were both significantly correlated with the AT-SAT rating composite. Also, the component tests of the OPM Battery, the Nonradar Screen program, and the Radar Training program all had significant correlations with the CBPM. Furthermore, all scales from the Over-the-Shoulder rating form used in the high-fidelity simulation (which were significantly correlated with both the CBPM and the AT-SAT rating composite) were also significantly correlated with both the Instructor Assessment and Technical Assessment ratings made during both the Nonradar Screen program and the Radar Training program. These results suggest that the CBPM and
the composite ratings are related to measures used in
the past as criterion measures of performance in air
traffic control.
Additional analyses suggest that the AT-SAT predictors are also related to tests previously used to select air
traffic controllers. The MCAT was correlated with many of the AT-SAT predictor tests, especially those involving dynamic activities. The Abstract Reasoning test
had a particularly high correlation with the Analogies
Reasoning score, but was also correlated with other
AT-SAT predictors.
Other tests, administered experimentally to air traffic
control candidates between the years of 1981 and 1995,
provided additional support for the construct validity of
AT-SAT predictor tests. For example, the Math Aptitude Test from the ETS Factor Reference Battery
(Ekstrom et al., 1976), the Dial Reading Test, and a
biographical item reporting high school math grades
(which was previously shown to be correlated with
success in the Nonradar Screen program) had high
correlations with the Applied Math Test. The Angles
and Dials tests were also correlated with Dial Reading,
Math Aptitude, and the biographical item reporting
high school math grades. These results are not surprising,
considering the consistent relationship, observed over
years of research, between aptitude for mathematics and
various measures of performance in air traffic control.
Finally, a multiple linear regression analysis was conducted which showed that several of the AT-SAT tests
contributed to the prediction of the variance in the AT-SAT composite criterion measure over and above the
OPM rating and the final score in the Nonradar Screen
program. The OPM battery and Nonradar Screen program provided an effective, though expensive, two-stage
process for selecting air traffic controllers that was used
successfully for many years. It appears that the AT-SAT
battery has equivalent, or better, predictive validity than
did the former selection procedure, and costs much less
to administer. Thus, it should be an improvement over
the previous selection process.
To maintain the advantage gained by using this new
selection procedure, it will be necessary to monitor its
effectiveness and validity over time. This will require
developing parallel forms of the AT-SAT tests, conducting predictive validity studies, developing and validating
new tests against criterion measures of ATC performance,
and replacing old tests with new ones if the former become
compromised or prove invalid for any reason.

REFERENCES

Aerospace Sciences, Inc. (1991). Air traffic control specialist pre-training screen preliminary validation. Fairfax, VA: Aerospace Sciences, Inc.
Alexander, J., Alley, V., Ammerman, H., Fairhurst, W., Hostetler, C., Jones, G., & Rainey, C. (1989, April). FAA air traffic control operation concepts: Volume VII, ATCT tower controllers (DOT/FAA/AP-87/01, Vol. 7). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alexander, J., Alley, V., Ammerman, H., Hostetler, C., & Jones, G. (1988, July). FAA air traffic control operation concepts: Volume II, ACF/ACCC terminal and en route controllers (DOT/FAA/AP-87/01, Vol. 2, CHG 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alexander, J., Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1989, September). FAA air traffic control operation concepts: Volume VIII, TRACON controllers (DOT/FAA/AP-87/01, Vol. 8). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Alley, V., Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1988, July). FAA air traffic control operation concepts: Volume V, ATCT/TCCC tower controllers (DOT/FAA/AP-87/01, Vol. 5, CHG 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Bergen, L., Davies, D., Hostetler, C., Inman, E., & Jones, G. (1987, November). FAA air traffic control operation concepts: Volume VI, ARTCC/HOST en route controllers (DOT/FAA/AP-87/01, Vol. 6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Fairhurst, W., Hostetler, C., & Jones, G. (1989, May). FAA air traffic control task knowledge requirements: Volume I, ATCT tower controllers (DOT/FAA/ATC-TKR, Vol. 1). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration.
Ammerman, H., Fligg, C., Pieser, W., Jones, G., Tischer, K., & Kloster, G. (1983, October). Enroute/terminal ATC operations concept (DOT/FAA/AP-83/16) (CDRL-AOO1 under FAA contract DTFA0183-Y-10554). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Advanced Automation Program Office.
Bobko, P., Nickels, B. J., Blair, M. D., & Tartak, E. L. (1994). Preliminary internal report on the current status of the SACHA model and task interconnections: Volume I.
Boone, J. O. (1979). Toward the development of a new selection battery for air traffic control specialists (DOT/FAA/AM-79/21). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Boone, J., Van Buskirk, L., & Steen, J. (1980). The Federal Aviation Administration's radar training facility and employee selection and training (DOT/FAA/AM-80/15). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Borman, W. C. (1979). Format and training effects on rating accuracy and rater errors. Journal of Applied Psychology, 64, 410-421.
Borman, W. C., Hedge, J. W., & Hanson, M. A. (1992, June). Criterion development in the SACHA project: Toward accurate measurement of air traffic control specialist performance (Institute Report #222). Minneapolis, MN: Personnel Decisions Research Institutes.
Broach, D. (1996, November). User's guide for v4.0 of the Air Traffic Scenarios Test for Windows (WinATST). Oklahoma City, OK: Federal Aviation Administration Civil Aeromedical Institute, Human Resources Research Division.
Broach, D., & Brecht-Clark, J. (1994). Validation of the Federal Aviation Administration air traffic control specialist pre-training screen (DOT/FAA/AM-94/4). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Brokaw, L. D. (1957, July). Selection measures for air traffic control training (Technical Memorandum PL-TM-57-14). Lackland Air Force Base, TX: Personnel Laboratory, Air Force Personnel and Training Research Center.
Brokaw, L. D. (1959). School and job validation of selection measures for air traffic control training (WADC-TN-59-39). Lackland Air Force Base, TX: Wright Air Development Center, United States Air Force.
Brokaw, L. D. (1984). Early research on controller selection: 1941-1963. In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Buckley, E. P., & Beebe, T. (1972). The development of a motion picture measurement instrument for aptitude for air traffic control (DOT/FAA/RD-71/106). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Systems Research and Development Service.
Buckley, E. P., DeBaryshe, B. D., Hitchner, N., & Kohn, P. (1983). Methods and measurements in real-time air traffic control system simulation (DOT/FAA/CT-83/26). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Buckley, E. P., House, K., & Rood, R. (1978). Development of a performance criterion for air traffic control personnel research through air traffic control simulation (DOT/FAA/RD-78/71). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Systems Research and Development Service.
Buckley, E. P., O'Connor, W. F., & Beebe, T. (1969). A comparative analysis of individual and system performance indices for the air traffic control system (Final report) (DOT/FAA/NA-69/40; DOT/FAA/RD-69/50; Government accession #710795). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, National Aviation Facilities Experimental Center, Systems Research and Development Service.
Buckley, E. P., O'Connor, W. F., & Beebe, T. (1970). A comparative analysis of individual and system performance indices for the air traffic control system (DOT/FAA/NA-69/40). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, National Aviation Facilities Experimental Center.
Buckley, E. P., O'Connor, W. F., Beebe, T., Adams, W., & MacDonald, G. (1969). A comparative analysis of individual and system performance indices for the air traffic control system (DOT/FAA/NA-69/40). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Carter, D. S. (1979). Comparison of different shrinkage formulas in estimating population multiple correlation coefficients. Educational and Psychological Measurement, 39, 261-266.
Cattell, R. B., & Eber, H. W. (1962). The sixteen personality factor questionnaire. Champaign, IL: Institute for Personality and Ability Testing.
Cobb, B. B. (1967). The relationships between chronological age, length of experience, and job performance ratings of air route traffic control specialists (DOT/FAA/AM-67/1). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Cobb, B. B., & Mathews, J. J. (1972). A proposed new test for aptitude screening of air traffic controller applicants (DOT/FAA/AM-72/18). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Collins, W. E., Manning, C. A., & Taylor, D. K. (1984). A comparison of prestrike and poststrike ATCS trainees: Biographic factors associated with Academy training success. In A. VanDeventer, W. Collins, C. Manning, D. Taylor, & N. Baxter (Eds.), Studies of poststrike air traffic control specialist trainees: I. Age, biographical factors, and selection test performance (DOT/FAA/AM-84/18). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Collins, W. E., Nye, L. G., & Manning, C. A. (1990). Studies of poststrike air traffic control specialist trainees: III. Changes in demographic characteristics of Academy entrants and bio-demographic predictors of success in air traffic control selection and Academy screening (DOT/FAA/AM-90/4). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Convey, J. J. (1984). Personality assessment of ATC applicants. In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Cooper, M., Blair, M. D., & Schemmer, F. M. (1994). Separation and Control Hiring Assessment (SACHA) draft preliminary approach predictors, Vol. 1: Technical report. Bethesda, MD: University Research Corporation.
Costa, P. T., Jr., & McCrae, R. R. (1988). Personality in adulthood: A six-year longitudinal study of self-reports and spouse ratings on the NEO Personality Inventory. Journal of Personality and Social Psychology, 54, 853-863.
Ekstrom, R. B., French, J. W., Harman, H. H., & Dermen, D. (1976). Manual for kit of factor-referenced cognitive tests. Princeton, NJ: Educational Testing Service.
Fleishman, E. A., & Quaintance, M. K. (1984). Taxonomies of human performance. Orlando, FL: Academic Press.
Gibb, G. D., Smith, M. L., Swindells, N., Tyson, D., Gieraltowski, M. J., Petschauser, K. J., & Haney, D. U. (1991). The development of an experimental selection test battery for air traffic control specialists. Daytona Beach, FL.
Hanson, M. A., Hedge, J. W., Borman, W. C., & Nelson, L. C. (1993). Plans for developing a set of simulation job performance measures for air traffic control specialists in the Federal Aviation Administration (Institute Report #236). Minneapolis, MN: Personnel Decisions Research Institutes.
Hedge, J. W., Borman, W. C., Hanson, M. A., Carter, G. W., & Nelson, L. C. (1993). Progress toward development of ATCS performance criterion measures (Institute Report #235). Minneapolis, MN: Personnel Decisions Research Institutes.
Hogan, R. (1996). Personality assessment. In R. S. Barrett (Ed.), Fair employment in human resource management (pp. 144-152). Westport, CT: Quorum Books.
Houston, J. S., & Schneider, R. J. (1997). Analysis of Experience Questionnaire (EQ) beta test data. Unpublished manuscript.
Human Technology, Inc. (1991). Cognitive task analysis of en route air traffic controller: Model extension and validation (Report No. OPM-87-9041). McLean, VA: Author.
Human Technology, Inc. (1993). Summary job analysis. Report to the Federal Aviation Administration Office of Personnel, Staffing Policy Division (Contract #OPM-91-2958). McLean, VA: Author.
Landon, T. E. (1991). Job performance for the en-route ATCS: A review with applications for ATCS selection. Paper submitted to Minnesota Air Traffic Controller Training Center.
Manning, C. A. (1991). Individual differences in air traffic control specialist training performance. Journal of Washington Academy of Sciences, 11, 101-109.
Manning, C. A. (1991). Procedures for selection of air traffic control specialists. In H. Wing & C. Manning (Eds.), Selection of air traffic controllers: Complexity, requirements and public interest (DOT/FAA/AM-91/9). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Manning, C. A., Della Rocco, P. S., & Bryant, K. D. (1989). Prediction of success in air traffic control field training as a function of selection and screening test performance (DOT/FAA/AM-89/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Mecham, R. C., & McCormick, E. J. (1969). The rated attribute requirements of job elements in the position analysis questionnaire (Office of Naval Research Contract Nonr-1100(28), Report No. 1). Lafayette, IN: Occupational Research Center, Purdue University.
Mies, J., Coleman, J. G., & Domenech, O. (1977). Predicting success of applicants for positions as air traffic control specialists in the Air Traffic Service (Contract No. DOT FA-75WA-3646). Washington, DC: Education and Public Affairs, Inc.
Milne, A. M., & Colmen, J. (1972). Selection of air traffic controllers for FAA (Contract No. DOT-FA70WA-2371). Washington, DC: Education and Public Affairs, Inc.
Myers, J., & Manning, C. (1988). A task analysis of the Automated Flight Service Station Specialist job and its application to the development of the Screen and Training program (Unpublished manuscript). Oklahoma City, OK: Civil Aeromedical Institute, Human Resources Research Division.
Nickels, B. J., Bobko, P., Blair, M. D., Sands, W. A., & Tartak, E. L. (1995). Separation and control hiring assessment (SACHA) final job analysis report (Deliverable Item 007A under FAA contract DFTA0191-C-00032). Washington, DC: Federal Aviation Administration, Office of Personnel.
Potosky, D., & Bobko, P. (1997). Assessing computer experience: The Computer Understanding and Experience (CUE) scale. Poster presented at the Society for Industrial and Organizational Psychology (SIOP), April 12, St. Louis, MO.
Pulakos, E. D. (1984). A comparison of rater training programs: Error training and accuracy training. Journal of Applied Psychology, 69, 581-588.
Pulakos, E. D. (1986). The development of a training program to increase accuracy with different rating formats. Organizational Behavior and Human Decision Processes, 38, 76-91.
Pulakos, E. D., & Borman, W. C. (1986). Rater orientation and training. In E. D. Pulakos & W. C. Borman (Eds.), Development and field test report for the Army-wide rating scales and the rater orientation and training program (Technical Report #716). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences.
Pulakos, E. D., Keichel, K. L., Plamondon, K., Hanson, M. A., Hedge, J. W., & Borman, W. C. (1996). SACHA task 3 final report (Institute Report #286). Minneapolis, MN: Personnel Decisions Research Institutes.
Rock, D. B., Dailey, J. T., Ozur, H., Boone, J. O., & Pickerel, E. W. (1978). Study of the ATC job applicants 1976-1977 (Technical Memorandum PL-TM-57-14). In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (pp. 397-410) (DOT/FAA/AM-84/2). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Schemmer, F. M., Cooper, M. A., Blair, M. D., Barton, M. A., Kieckhaefer, W. F., Porter, D. L., Abrahams, N., Huston, J., Paullin, C., & Bobko, P. (1996). Separation and Control Hiring Assessment (SACHA) interim approach predictors, Volume 1: Technical report. Bethesda, MD: University Research Corporation.
Schroeder, D. J., & Dollar, C. S. (1997). Personality characteristics of pre/post-strike air traffic control applicants (DOT/FAA/AM-97/17). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Schroeder, D. J., Dollar, C. S., & Nye, L. G. (1990). Correlates of two experimental tests with performance in the FAA Academy Air Traffic Control Nonradar Screen Program (DOT/FAA/AM-90/8). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.
Sollenberger, R. L., Stein, E. S., & Gromelski, S. (1997). The development and evaluation of a behaviorally based rating form for assessing air traffic controller performance (DOT/FAA/CT-TN96-16). Atlantic City, NJ: U.S. Department of Transportation, Federal Aviation Administration, Technical Center.
Stein, E. S. (1992). Simulation variables. Unpublished manuscript.
Taylor, D. K., VanDeventer, A. D., Collins, W. E., & Boone, J. O. (1983). Some biographical factors associated with success of air traffic control specialist trainees at the FAA Academy during 1980. In A. VanDeventer, D. Taylor, W. Collins, & J. Boone (Eds.), Three studies of biographical factors associated with success in air traffic control specialist screening/training at the FAA Academy (DOT/FAA/AM-83/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Taylor, M. V., Jr. (1952). The development and validation of a series of aptitude tests for the selection of personnel for positions in the field of Air Traffic Control. Pittsburgh, PA: American Institutes for Research.
Trites, D. K. (1961). Problems in air traffic management: I. Longitudinal prediction of effectiveness of air traffic controllers (DOT/FAA/AM-61/1). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Trites, D. K., & Cobb, B. B. (1963). Problems in air traffic management: IV. Comparison of pre-employment job-related experience with aptitude test predictors of training and job performance of air traffic control specialists (DOT/FAA/AM-63/31). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Tucker, J. A. (1984). Development of dynamic paper-and-pencil simulations for measurement of air traffic controller proficiency (pp. 215-241). In S. B. Sells, J. T. Dailey, & E. W. Pickrel (Eds.), Selection of air traffic controllers (DOT/FAA/AM-84/2). Oklahoma City, OK: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
VanDeventer, A. D. (1983). Biographical profiles of successful and unsuccessful air traffic control specialist trainees. In A. VanDeventer, D. Taylor, W. Collins, & J. Boone (Eds.), Three studies of biographical factors associated with success in air traffic control specialist screening/training at the FAA Academy (DOT/FAA/AM-83/6). Washington, DC: U.S. Department of Transportation, Federal Aviation Administration, Office of Aviation Medicine.
Weltin, M., Broach, D., Goldbach, K., & O'Donnell, R. (1992). Concurrent criterion related validation of air traffic control specialist pre-training screen. Fairfax, VA: Author.
Wherry, R. J. (1940). Appendix A. In W. H. Stead & C. P. Shartle (Eds.), Occupational counseling techniques.
Yee, P. L., Hunt, E., & Pellegrino, J. W. (1991). Coordinating cognitive information: Task effects and individual differences in integrating information from several sources. Cognitive Psychology, 23, 615-680.


Federal Aviation
Administration

DOT/FAA/AM-13/3
Office of Aerospace Medicine
Washington, DC 20591

The Validity of the Air Traffic Selection and Training (AT-SAT) Test Battery in Operational Use

Dana Broach
Cristina L. Byrne
Carol A. Manning
Linda Pierce
Darendia McCauley
M. Kathryn Bleckley

Civil Aerospace Medical Institute
Federal Aviation Administration
Oklahoma City, OK 73125

March 2013

Final Report


NOTICE
This document is disseminated under the sponsorship
of the U.S. Department of Transportation in the interest
of information exchange. The United States Government
assumes no liability for the contents thereof.
___________
This publication and all Office of Aerospace Medicine
technical reports are available in full-text from the Civil
Aerospace Medical Institute's publications Web site:
www.faa.gov/go/oamtechreports

Technical Report Documentation Page


1. Report No.: DOT/FAA/AM-13/3
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: The Validity of the Air Traffic Selection and Training (AT-SAT) Test Battery in Operational Use
5. Report Date: March 2013
6. Performing Organization Code:
7. Author(s): Broach D, Byrne CL, Manning CA, Pierce L, McCauley D, Bleckley MK
8. Performing Organization Report No.:
9. Performing Organization Name and Address: FAA Civil Aerospace Medical Institute, P.O. Box 25082, Oklahoma City, OK 73125
10. Work Unit No. (TRAIS):
11. Contract or Grant No.:
12. Sponsoring Agency Name and Address: Office of Aerospace Medicine, Federal Aviation Administration, 800 Independence Ave., S.W., Washington, DC 20591
13. Type of Report and Period Covered:
14. Sponsoring Agency Code:
15. Supplemental Notes:
16. Abstract:
Applicants for the air traffic control specialist (ATCS) occupation from the general public and graduates from post-secondary institutions participating in the FAA's Air Traffic Collegiate Training Initiative (AT-CTI) must take and pass the Air Traffic Selection and Training (AT-SAT) test battery as part of the selection process. Two concurrent, criterion-related validation studies demonstrated that AT-SAT was a valid predictor of ATCS job performance (American Institutes for Research, 2012; Ramos, Heil, & Manning, 2001a,b). However, the validity of AT-SAT in operational use has been questioned since implementation in 2002 (Barr, Brady, Koleszar, New, & Pounds, 2011; Department of Transportation Office of the Inspector General, 2010). The current study investigated the validity of AT-SAT in operational use.
Method. AT-SAT and field training data for 1,950 air traffic controllers hired in fiscal years 2007 through 2009 were analyzed by correlation, cross-tabulation, and logistic regression with achievement of Certified Professional Controller (CPC) status as the criterion.
Results. The correlation between AT-SAT and achievement of CPC status was .127 (n=1,950, p<.001). The correlation was .188 when corrected for direct restriction in range. A larger proportion of controllers in the Well Qualified score band (85-100) achieved CPC status than in the Qualified (70-84.99) band. The logistic regression model did not fit the data well (chi-square=30.659, p<.001, -2LL=1920.911). AT-SAT modeled only a small proportion of the variance in achievement of CPC status (Cox and Snell R2=.016, Nagelkerke R2=.025). The logistic regression coefficient for AT-SAT score of .049 was significant (Wald=30.958, p<.001).
Discussion. AT-SAT is a valid predictor of achievement of CPC status at the first assigned field facility. However, the correlation is likely attenuated by time and intervening variables such as the training process itself. Other factors might include the weighting of subtest scores and use of a narrow criterion measure. Further research on the validity of AT-SAT in relation to multiple criteria is recommended.
17. Key Words: ATCS Selection, Aptitude, Validity
18. Distribution Statement: Document is available to the public through the Internet: www.faa.gov/go/oamtechreports
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 14
22. Price:

Form DOT F 1700.7 (8-72). Reproduction of completed page authorized.


ACKNOWLEDGMENTS
Research reported in this paper was conducted under the Air Traffic Program Directive/
Level of Effort Agreement between the Human Factors Research and Engineering Division
(ANG-C1), FAA Headquarters, and the Aerospace Human Factors Research Division (AAM-500) of the FAA Civil Aerospace Medical Institute.
The opinions expressed are those of the authors alone, and do not necessarily reflect
those of the Federal Aviation Administration, the Department of Transportation, or Federal
government of the United States.
Correspondence concerning this report should be addressed to Dana Broach, Aerospace
Human Factors Research Division (AAM-500), P.O. Box 25082, Oklahoma City, OK 73125.
E-mail: [email protected]


CONTENTS

The Validity of the Air Traffic Selection and Training (AT-SAT) Test Battery in Operational Use-------- 1
Background ------------------------------------------------------------------------------------------------------------ 2
Method -------------------------------------------------------------------------------------------------------------------- 2
Sample ------------------------------------------------------------------------------------------------------------------ 2
Measures ---------------------------------------------------------------------------------------------------------------- 3
Analyses----------------------------------------------------------------------------------------------------------------- 4
Results---------------------------------------------------------------------------------------------------------------------- 5
Discussion ----------------------------------------------------------------------------------------------------------------- 6
References----------------------------------------------------------------------------------------------------------------- 8


THE VALIDITY OF THE AIR TRAFFIC SELECTION AND TRAINING (AT-SAT) TEST BATTERY IN OPERATIONAL USE

The air traffic control specialist (ATCS) occupation is the single largest and most publicly visible workforce in
the Federal Aviation Administration (FAA). ATCSs, also
known as air traffic controllers, or most simply, controllers,
are responsible for the safe, efficient, and orderly flow of air
traffic in the U.S. air transportation system. There are just
over 15,000 non-supervisory controllers working in 315
air traffic control facilities handling 30,000 commercial
and other flights per day. It is an attractive job with a six-figure income and federal benefits if a person survives the winnowing process from application to certification. In the past, fewer than 4% of applicants successfully completed the grueling gauntlet of aptitude tests, screens, classroom training, simulation training, on-the-job training, and over-the-shoulder performance evaluations with live traffic to
become fully certified controllers (Broach, 1998).
The first hurdle in this lengthy process is getting hired.
The FAA projects that it will hire several hundred to about a
thousand new controllers each year between now and 2020
to replace retiring controllers (FAA, 2012). There are three
primary paths to becoming an air traffic controller with the
FAA. The first path is for persons who have previously been
appointed and served as controllers. According to ATCS
hiring data compiled by the Air Traffic Organization (R.
Mitchell, personal communication, October 17, 2012),
about 30% of new controllers have entered the FAA via
this path in recent years, most commonly from the ranks
of military air traffic controllers. The second path is for
persons who completed an ATCS training program at one
of 36 post-secondary educational institutions participating
in the FAA's Air Traffic Collegiate Training Initiative (AT-CTI) program. About 35% of new controllers have entered the FAA via the AT-CTI path since 2006. The third path is
for persons from the general public. No previous air traffic control experience or training is required on this path.
About 35% of new controllers have entered service via the
general public path since 2006. There are several other
paths, but they account for a very small proportion of new
hires since 2006. The focus of this paper is on those hired
via the AT-CTI and general public paths.
The U.S. Office of Personnel Management (OPM)
established the qualification standards that an applicant must
meet to enter the ATCS occupation. At a bare minimum,
an ATCS applicant must be a U.S. citizen and have a high
school diploma (or equivalent). In addition, the applicant
must have three years of progressively responsible work experience, or a four-year college degree, or some combination of work experience and post-secondary education. A general public or AT-CTI applicant must also meet two additional
qualification standards. First, an applicant must not have
reached his or her 31st birthday by the time a bona fide
tentative job offer is made and accepted. Second, the applicant must obtain a qualifying score on an aptitude test
for the occupation.
The computerized Air Traffic Selection and Training
(AT-SAT) battery is the aptitude test currently used by the
FAA to assess general public and AT-CTI applicants under
the OPM occupational qualification standard. AT-SAT
has been in operational use since 2002 (King, Manning,
& Drechsler, 2007). Relatively few persons were tested in
2002 through 2005, as the FAA was not hiring many new
air traffic controllers at that time. However, beginning in
mid-2006, retirements from the ATCS workforce surged,
and the pace of hiring new controllers increased substantially.
Since 2006, FAA has administered the AT-SAT battery to
more than 22,000 applicants and hired 6,826 as new controllers via the AT-CTI and general public paths.
Three principal criticisms of AT-SAT have been made.
First, significant differences in score distributions by race
and sex were observed in the course of validation, with
Blacks and Hispanic-Latinos scoring lower than Whites
and women scoring lower than men (Waugh, 2001, p. 44).
The FAA re-weighted the AT-SAT subtests to mitigate these
group differences without substantially reducing validity
(Wise, Tsacoumis, Waugh, Putka, & Hom, 2001; Dattel &
King, 2006; King, Manning, & Drechsler, 2006). Second,
the pass rate was substantially higher than was originally
projected. While a pass rate of about 67% was predicted
by Wise et al. after re-weighting, the actual pass rate in
operational use has been greater than 90% (Department
of Transportation Office of the Inspector General [DOT
OIG], 2010; King et al.). Third, the validity of AT-SAT
as a predictor of training outcomes and job performance
has been questioned. For example, a Congressional committee chairman has expressed particular concern about
whether FAAs screening test identifies candidates potential
to become air traffic controllers (DOT OIG, 2010, p.
1). Barr, Brady, Koleszar, New, & Pounds (2011) found
no completed studies that determined if the AT-SAT
actually predicted job performance success among those
who took the exam, were accepted for Academy training,
and who subsequently entered and completed on-the-job
training in the field (p. 9). They concluded that without
such a longitudinal study, the FAA cannot be sure that the
AT-SAT is accomplishing its original goals of predictability (ibid). The purpose of the current study is to investigate the validity of AT-SAT as a predictor of training outcomes: To what degree does AT-SAT predict successful completion of on-the-job training in the field?

BACKGROUND

Validity is used here in accordance with the Uniform Guidelines on Employee Selection Procedures (29 C.F.R. 1607 (2012)), the Civil Rights Act of 1991 (42 U.S.C. 2000e et seq., 2011), and the relevant professional standards and principles for the development, validation, and use of employee selection tests and procedures (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Society for Industrial and Organizational Psychology (SIOP), 2003). Validity refers to the evidence supporting the inference to be drawn on the basis of a score on a given test. In personnel selection, the inference to be drawn is expected future job performance, as represented by criterion measures (Sackett, Putka, & McCloy, 2012; SIOP, 2003). Example criterion measures representing job performance are production rate, error rate, tenure (retention), job performance ratings, and training performance, including outcomes (29 C.F.R. 1607.14B(3)). That predictive inference about future job performance is made on the basis of the statistical relationship between predictor test score and the criterion measure, where the relationship is expressed as a correlation. Also known as a validity coefficient, the correlation mathematically describes how much the criterion measure changes as a function of predictor test scores.

Validation is the process of accumulating empirical evidence about that statistical relationship between test score and the criterion (or criteria) to support the predictive inference. Two common approaches to developing validation evidence in personnel selection contexts are predictive criterion-related validation studies and concurrent criterion-related validation studies (SIOP, 2003). Empirical evidence for the validity of AT-SAT as a predictor of job performance was provided through two concurrent, criterion-related validation studies. The first study was reported in 2001 (Ramos, Heil, & Manning, 2001a, b). Approximately 1,000 incumbent en route controllers took the proposed test battery. Job performance data were collected concurrently in two forms: Behavioral Summary Scale (BSS) ratings of job performance by peers and supervisors; and the en route Computer-Based Performance Measure (CBPM; see Hanson, Borman, Mogilka, Manning, & Hedge, 1999). The correlation between the test score and the composite job performance measure was .52 without any corrections for range restriction or criterion unreliability. With correction for incidental range restriction, the correlation was .68 (Waugh, 2001). The second concurrent criterion-related validation study was conducted by the American Institutes for Research (AIR; 2012). The current operational version of AT-SAT was administered to 302 incumbent air traffic control tower (ATCT) controllers. As in the original en route validation study, two classes of job performance data were collected: Behavioral Summary Scale (BSS) ratings of job performance by peers and supervisors; and performance on the Tower Computer-Based Performance Measure (see Horgen et al., 2012). The correlation between an optimally weighted composite of AT-SAT subtest scores and the composite of the two criterion measures was .42 without any corrections (AIR, p. 47). These two studies independently demonstrated that AT-SAT is a valid predictor of ATCS job performance. The current study develops a third line of evidence for the validity of AT-SAT by investigating the degree to which achievement of CPC status at the first field facility can be predicted from AT-SAT scores.

METHOD
Sample
The sample for this study consisted of air traffic
controllers hired in fiscal years 2007-2009. Sufficient time
has elapsed for most persons hired in these fiscal years to
complete the field training sequence, averaging two to three
years. To identify the sample, records were extracted from
the Air Traffic Organization's Air Traffic Controller National
Training Database (ATC NTD) and matched with AT-SAT
examination records at the individual level. The ATC NTD
contains data for persons who reported to a field facility for
on-the-job training (OJT); data for persons who failed or
withdrew from FAA Academy training and did not enter
OJT at a field facility are not in the NTD. The ATC NTD
contained records for 11,450 new hires at field facilities as
of July 2012, of which 6,941 were for general public or
CTI hires. This pool was reduced to 6,865 records after
screening for complete identifiers and duplicates. These
records were then filtered by fiscal year of entry-on-duty
and valid AT-SAT scores, resulting in a sample of 2,569
first facility training records for new controllers. Records
for new hires who left the field facility training for other
reasons (unrelated to performance, per NTD; n=160), who
requested transfer prior to completion of facility training
(n=156), or who were still in facility training (n=303) were
dropped, leaving a total of 1,950 records for analysis.
All of the controllers in the sample had been hired under vacancy announcements open to the general public and AT-CTI graduates. Most (69%) were hired under a general public announcement.
Table 1. Demographic characteristics and descriptive statistics

Characteristic                        Applicants (N=15,173)   Sample (N=1,950)
Race/National Origin (RNO) Group
  Asian                               464 (3.1%)              45 (2.3%)
  Black                               3,039 (20.0%)           175 (9.0%)
  Hawaiian-Pacific Island             77 (0.5%)               6 (0.3%)
  Hispanic-Latino                     814 (5.4%)              65 (3.3%)
  Native American-Alaskan Native      63 (0.4%)               10 (0.5%)
  White                               8,906 (58.7%)           1,173 (60.2%)
  Multi-racial [1]                    1,059 (7.0%)            102 (5.2%)
  No RNO group(s) marked              738 (4.9%)              96 (4.9%)
  Missing data                        13 (0.1%)               278 (14.3%)
Sex
  Female                              3,449 (22.7%)           307 (15.7%)
  Male                                11,127 (73.3%)          1,330 (68.2%)
  Missing data                        597 (3.9%)              313 (16.1%)
Age Mean (SD)                         25.2 (3.25)             25.2 (2.84)
AT-SAT Mean (SD)                      85.87 (9.39)            90.99 (6.27)

Notes: [1] Two or more RNO groups marked

Table 2. AT-SAT Subtests

Subtest                              Description
Dials (DI)                           Scan and interpret readings from a cluster of analog instruments
Applied Math (AM)                    Solve basic distance, rate, and time problems
Scan (SC)                            Scan dynamic display to detect targets that change over time
Angles (AN)                          Determine interior and exterior angle of intersecting lines
Letter Factory (LF)                  Manage virtual production line, box products, perform quality control
Air Traffic Scenarios Test (ATST)    Direct aircraft to destination in low-fidelity radar simulation
Analogies (AY)                       Solve verbal and non-verbal analogies
Experience Questionnaire (EQ)        Life experiences, preferences, and typical behavior in situations

The sample was predominantly White (60%) and male (68%). The average age at the time of entry-on-duty with the FAA was 25.2 (SD=2.8 years). Demographic statistics for the general public and CTI applicant population (n=15,173) and the sample are presented in Table 1.

Measures
AT-SAT is a computerized aptitude test of cognitive abilities, skills, and other personal characteristics identified through formal job analysis as being required at the time of entry into the ATCS occupation. AT-SAT consists of eight subtests: Dials (DI); Applied Math (AM); Scan (SC); Angles (AN); Letter Factory Test (LF); Air Traffic Scenarios Test (ATST); Analogies (AY); and the Experience Questionnaire (EQ). See Table 2 for a brief description of each subtest. A weighted composite score is computed from subtest scores.
The FAA uses category ranking in the selection of controllers. Applicants with AT-SAT composite scores of less than 70 are not qualified for consideration for employment. Scores of 70 to 84.99 place an applicant in the Qualified category, while scores of 85 to 100 put an applicant in the Well Qualified category. Applicants in the Well Qualified category are considered first for vacancies, with veterans' preference applied in accordance with civil service rules. Applicants in the Qualified category are considered if there is an insufficient number in the Well Qualified category to meet FAA hiring needs. As a consequence, the sample (persons hired) was largely drawn from the Well Qualified category (87% of the sample). However, among all applicants, just 57% are ranked in the Well Qualified category. In other words, Well Qualified candidates were over-represented in the sample relative to the applicant population. The mean AT-SAT score for the sample was 86.29 (SD=6.42), compared to 85.85 (SD=9.41) for the applicant population.

Table 3. Training outcome at first field facility as coded in the NTD

Outcome                                                      n        %
Not CPC: Facility Fail
  Employment Terminated Prior to Completion                  135      6.0%
  Reassigned to a non-ATC FAA position                       22       1.0%
  Training Discontinued by Air Traffic Manager (ATM)         10       0.4%
  Training Failure - Pending Human Resources (HR) Action     5        0.2%
  Employment Termination Letter Issued                       4        0.2%
  Employee Withdrew From Training                            2        0.1%
Not CPC: Transfer Lower
  Reassigned to Another 2152 Facility                        212      9.4%
CPC: Successfully Completed Training                         1,560    69.2%

Analyses
Three analyses were conducted. First, the simple Pearson product-moment correlation between AT-SAT score and field training outcome (achievement of CPC status) at the first assigned field facility was computed, without corrections for direct range restriction on the predictor or criterion unreliability. This raw correlation provides a conservative, lower-bound estimate of AT-SAT's validity as a predictor of field training outcome. The correlation was then corrected for direct range restriction on the predictor (AT-SAT) using the Ghiselli, Campbell, and Zedeck (1981, p. 299) equation 10-12. The corrected correlation provides a less biased estimate of the AT-SAT's validity as a predictor of field training outcome. No correction for criterion unreliability was made. Second, a 2-by-2 (AT-SAT score band [Qualified, Well Qualified] by first facility training outcome [Not CPC, CPC]) chi-square analysis was conducted. The odds of certifying by score band were estimated. Third, logistic regression was used to model the relationship of AT-SAT score to achievement of CPC status at the first field facility. The odds of certifying by AT-SAT score were estimated from the logistic regression equation. All analyses were conducted using SPSS version 20.
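
As an illustration of the second step, the standard correction for direct range restriction on the predictor can be written in a few lines of Python. The sketch below is not taken from the study; it simply applies the usual univariate correction with the observed validity coefficient (.127) and the applicant and sample standard deviations reported in Table 1, and it reproduces the corrected value of approximately .188 reported in the abstract and Results.

# Illustrative sketch: correction of the observed validity coefficient for
# direct range restriction on the predictor. The inputs below are taken from
# the report (r = .127; applicant SD = 9.39 and sample SD = 6.27, Table 1).
import math

def correct_direct_range_restriction(r, sd_unrestricted, sd_restricted):
    """Return the correlation corrected for direct restriction on the predictor."""
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1 - r**2 + (r**2) * (k**2))

r_observed = 0.127
r_corrected = correct_direct_range_restriction(r_observed, 9.39, 6.27)
print(f"Corrected validity coefficient: {r_corrected:.3f}")   # approximately 0.188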

Successful completion of training at the first field facility is a desirable outcome for both the agency and the individual. Therefore, a dichotomous variable was derived from the ATC NTD data to represent success in first facility OJT. Individuals who failed, withdrew, or had training terminated at the first facility were coded as "Facility Fail" in the ATC NTD. Facility failure can result in the termination of employment. However, the agency also has the discretion to offer an individual at risk for failure a transfer to a lower level, less complex facility if a position is available (FAA, 2006). The ATC NTD coded these cases as "Transfer Lower." Such a transfer is still an adverse outcome from an agency perspective due to the associated costs and staffing gap created by the loss at the first facility. Therefore, persons who were categorized in NTD as "Facility Fail" and "Transfer Lower" were coded as having failed to achieve CPC status (Not CPC; n=390) at the first facility, while persons categorized as "Completed" in the ATC NTD (n=1,560) were coded as CPC (Table 3).
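In code form this recoding is a one-line mapping; a minimal sketch using the outcome labels as given in Table 3 (the actual NTD extract and its field names are not in the record):

    # Binary criterion: 1 = CPC ("Completed"), 0 = Not CPC ("Facility Fail"
    # or "Transfer Lower").
    def cpc_status(ntd_outcome: str) -> int:
        return 1 if ntd_outcome == "Completed" else 0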

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 252 of 342 Page ID
#:447
RESULTS

The simple correlation between AT-SAT score and achievement of CPC status at the first field facility was .127 (n=1,950, p <.001) without corrections for direct range restriction on the predictor or criterion unreliability. With correction for direct range restriction on the predictor, the correlation between AT-SAT score and status was .188. The usual tests of statistical significance do not apply to correlations corrected for restriction in range (SIOP, 2003).

The cross-tabulation of AT-SAT score band by achievement of CPC status is presented in Table 4. Eight persons scored below 70 on their first AT-SAT examination and were excluded from the cross-tabulation analysis. Because of FAA hiring policies, most new hires were selected from the Well Qualified score band. Overall, 82% of the 1,681 Well Qualified new hires successfully completed field training at the first assigned field facility, compared to 71% of the 261 new hires from the Qualified score band (χ2=17.54, p <.001). New hires from the Well Qualified score band were 1.86 times more likely to achieve CPC status than new hires from the Qualified score band (odds ratio confidence interval=1.39 to 2.49).

The logistic regression of AT-SAT on achievement of CPC status at the first facility resulted in correct classification of 57.5% of the 1,950 cases, as shown in Table 5. As expected, in view of the modest correlation between AT-SAT and the criterion measure, the model did not fit the data well (χ2=30.66, p <.001, -2LL=1920.91). AT-SAT modeled only a small proportion of the variance in the field training outcome of achieving CPC status (Cox and Snell R2=.016, Nagelkerke R2=.025).
Table 4. Cross-tabulation of AT-SAT score band by field training outcome (expectancy table)

                              Field Training Outcome
  AT-SAT Score Band        Unsuccessful        Successful          Total
  Qualified                  77 (29.5%)        184 (70.5%)           261
  Well Qualified            309 (18.4%)      1,372 (81.6%)         1,681
  Total                     386              1,556                 1,942
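As a check on the figures reported above, the odds ratio follows directly from the Table 4 counts: the odds of completing training were 1,372/309 ≈ 4.44 for Well Qualified hires and 184/77 ≈ 2.39 for Qualified hires, and 4.44/2.39 ≈ 1.86, the odds ratio reported in the text.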

Table 5. Logistic regression cross-classification table (cut value=.80)

                              Predicted Outcome
  Observed Outcome         Unsuccessful        CPC          % Correct
  Unsuccessful                  217            173            55.6%
  CPC                           656            904            57.9%
  Overall %                                                   57.5%
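The classification percentages in Table 5 follow from the counts: 217/390 = 55.6% of unsuccessful cases and 904/1,560 = 57.9% of CPCs were classified correctly, for an overall rate of (217 + 904)/1,950 = 57.5%.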

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 253 of 342 Page ID
#:448

Figure 1. Odds of achieving CPC at the first field facility by AT-SAT composite score (plot; y-axis: odds of CPC at first facility, x-axis: AT-SAT weighted composite score)

Nevertheless, the logistic regression coefficient for AT-SAT score of .049 was significant (Wald=30.958, p <.001). The odds of certifying at the first assigned field facility were computed from the logistic regression equation as a function of AT-SAT score (Figure 1; see Norusis, 1990, pp. 49-50). A new hire with an AT-SAT score of 70 had slightly better than even odds (1.5 to 1) of achieving CPC status. In comparison, a new hire with an AT-SAT score of 85 had slightly better than 3-to-1 odds of achieving CPC status. In other words, new hires with higher AT-SAT scores had better odds of achieving CPC status at the first field facility than new hires with lower AT-SAT scores.
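To make the odds calculation concrete: under the fitted model, odds(CPC) = exp(b0 + b1 × AT-SAT). The report gives b1 = .049 but not the intercept; an intercept near -3.0 reproduces the plotted values, since exp(-3.0 + .049 × 70) ≈ 1.5 and exp(-3.0 + .049 × 85) ≈ 3.2. Each additional AT-SAT point therefore multiplies the odds of certifying by exp(.049) ≈ 1.05.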

DISCUSSION

The current study investigated the validity of AT-SAT as a predictor of achievement of CPC status at the first field facility. The results showed that AT-SAT was a valid predictor of training outcome for the next generation of air traffic controllers. First, the correlation between AT-SAT score and training status was positive and significant. Second, persons with higher scores were more likely to certify at the first assigned field facility than were persons with lower scores, as shown by the χ2 analysis. Third, logistic regression analysis found the odds of certifying at the first facility increased with AT-SAT score. Taken together, the results of the present investigation and those of the two previous criterion-related validation studies show that AT-SAT is a valid predictor of both OJT outcome (achievement of CPC status) and, more importantly, of on-the-job performance after certification. In other words, the empirical evidence supports the validity of AT-SAT as a personnel selection procedure for the ATCS occupation.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 254 of 342 Page ID
#:449
The uncorrected correlation between AT-SAT and achievement of CPC status in this study was small in Cohen's (1988) frequently cited categorization of effect sizes. In comparison, Bertua, Anderson, and Salgado (2005) reported average uncorrected correlations from .15 to .30
between various types of cognitive ability tests and criterion
measures. Other point estimates of the validity of cognitive
ability tests range from .29 to .51 (Bobko, Roth, & Potosky,
1999; Hunter & Hunter, 1984; Schmidt & Hunter, 1998).
In another meta-analysis, Robbins et al. (2004) reported an average correlation of .121 between college admissions test (ACT, SAT) scores and retention in 4-year college programs.
While the AT-SAT correlation with field training outcome
was low, it is within the range of values reported for other
cognitively-loaded selection instruments.
Moreover, AT-SAT predicted achievement of CPC status
several years after testing despite many intervening variables.
Both time and intervening variables attenuate predictor-criterion relationships (Barrett, Caldwell, & Alexander, 1989;
Barrett, Alexander, & Doverspike, 1992; Beier & Ackerman,
2012; Murphy 1989; Van Iddekinge & Ployhart, 2008). The
average time between testing and completion of field training
or loss was 34 months (SD=10.9 months). It might also be
the case that not all of the field attrition was due to lack of
aptitude. For example, losses might be due to economic factors such as a lack of affordable housing and lifestyle factors
(e.g., lengthy commute or the availability of affordable and
flexible childcare). Losses for these reasons are unlikely to
be predictable from an aptitude test. Better information is
needed to understand and categorize losses in field training
for future investigations of the validity of AT-SAT.
Even though the correlation was modest and despite
the intervening variables, AT-SAT as a selection procedure
could have practical utility. ATCS selection is a large-scale,
high-stakes selection process. ATCS training is expensive,
with an estimated cost per developmental of $93,000 per
year (FAA, 2012). Selection of only applicants from the Well
Qualified score band would have increased the net success
rate to 82%, avoiding 77 unnecessary field failures in this
cohort. Reducing the field failures by 77 persons would have
avoided about $7M ($93,000 x 77 persons) in cumulative
lost costs in personnel compensation and benefits for this
sample of new hires.1

In closing, the current study provides additional empirical evidence that AT-SAT is a valid selection procedure
for the ATCS occupation. Persons with higher scores on
AT-SAT were more likely to successfully certify at their first
field facility. Field attrition among developmental controllers has often been framed as a problem in initial selection
and placement. However, only a small proportion of the
variance in achievement of CPC status was explained by
aptitude test scores collected two or three years earlier, as
evidenced by the small correlation between AT-SAT and
CPC status. There are several possible explanations for this
observation. First, achievement of CPC status is a binary
criterion representing minimally acceptable performance
at the completion of training. Binary criteria inherently
limit the value of any correlation as the distribution shifts
away from a 50/50 split (Ghiselli et al., 1981). In contrast,
multiple criterion measures were used in the concurrent,
criterion-related validation studies, measures that encompassed the broad range of controller work behaviors. Those
criterion measures assessed typical job performance on multiple dimensions from peer and supervisor perspectives and
maximal technical job performance on meaningful interval
scales. Further investigation of AT-SAT's validity in relation to additional criterion measures such as performance in FAA Academy initial qualifications training, organizational citizenship behavior, counter-productive work behavior, job knowledge, and post-CPC technical job performance is recommended. This will require the development and
collection of psychometrically sound measures of individual
controller job performance. Second, the weights given
to the subtest scores might not be optimal for predicting achievement of CPC status. AT-SAT was originally
weighted to select those whose job performance would be
higher than average; a different weighting approach might
be required to predict CPC status, a far different criterion.
Finer-grained analyses of subtest scores and their weights
are recommended in continuing evaluations. Third, the
relationship of predictor and achievement of CPC status
might be attenuated by time and intervening variables.
Research on the training process itself, as delivered at field
facilities, and investigations into the reasons developmental
controllers do not achieve CPC status are recommended.
Careful attention must be given to the reasons why and when
new controllers leave field training in order to understand
what can be predicted from performance on an aptitude
test battery and what cannot.

1
The actual avoided costs depend on when each individual left field training. The FAA estimates the cost of training at $93,000/year, or $7,750/month. If 47 developmental controllers left training after 10 months, 25 at 20 months, and 5 at 30 months, the avoided lost costs would be (47 x 10 x $7,750) + (25 x 20 x $7,750) + (5 x 30 x $7,750), or $8,680,000. The $7M figure is a rough-order-of-magnitude or benchmark estimate based on the assumption that attrition occurs in the first year.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 255 of 342 Page ID
#:450
REFERENCES

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing (4th Ed.). Washington, DC: American Psychological Association.

American Institutes for Research. (2012, September). Validate AT-SAT as a placement tool. (Draft report prepared under FAA contract DTFWA-09-A-80027, Appendix C). Oklahoma City, OK: Federal Aviation Administration Aerospace Human Factors Research Division (AAM-500).

Barr, M., Brady, T., Koleszar, G., New, M., & Pounds, J. (September 22, 2011). FAA Independent Review Panel on the selection, assignment and training of air traffic control specialists. Washington, DC: Federal Aviation Administration. http://www.faa.gov/news/updates/media/IRP%20Report%20on%20Selection%20Assignment%20Training%20of%20ATCS%20FINAL%2020110922.pdf

Barrett, G.V., Alexander, R.A., & Doverspike, D. (1992). The implications for personnel selection of apparent declines in predictive validities over time: A critique of Hulin, Henry, and Noon. Personnel Psychology, 45, 601-617.

Barrett, G.V., Caldwell, M.S., & Alexander, R.A. (1989). The predictive stability of ability requirements for task performance: A critical reanalysis. Human Performance, 2, 167-181.

Beier, M.E. & Ackerman, P.L. (2012). Time in personnel selection. In N. Schmitt (Ed.), The Oxford handbook of personnel assessment and selection (pp. 721-739). New York, NY: Oxford University Press.

Bertua, C., Anderson, N., & Salgado, J.F. (2005). The predictive validity of cognitive ability tests: A UK meta-analysis. Journal of Occupational and Organizational Psychology, 78, 387-409.

Bobko, P., Roth, P., & Potosky, D. (1999). Derivation and implications of a meta-analytic matrix incorporating cognitive ability, alternative predictors and job performance. Personnel Psychology, 52, 561-589.

Broach, D. (Ed.) (1998). Recovery of the FAA air traffic control specialist workforce. (Report No. DOT/FAA/AM-98/23). Washington, DC: Federal Aviation Administration Office of Aviation Medicine.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd Ed.). Hillsdale, NJ: Erlbaum.

Dattel, A.R. & King, R.E. (2006). Reweighting AT-SAT to mitigate group score differences. (Report No. DOT/FAA/AM-06/16). Washington, DC: Federal Aviation Administration Office of Aerospace Medicine.

Department of Transportation Office of the Inspector General. (2010). Review of screening, placement, and initial training of newly hired air traffic controllers. (Report No. AV-2010-049). Washington, DC: Author.

Federal Aviation Administration. (2006). Employment policy for air traffic control specialist in training. (Human Resources Policy Manual (HRPM) Volume 1: Employment; HRPM Supplement EMP-1.14 (ATCS Employment Policy)). Last retrieved November 8, 2012 from the FAA employee website https://employees.faa.gov/org/staffoffices/ahr/program_policies/policy_guidance/hr_policies/hrpm/emp/emp-1-14_sup/

Federal Aviation Administration. (2012, April). A plan for the future: 10-year strategy for the air traffic control workforce 2012-2021. Washington, DC: Author.

Ghiselli, E.E., Campbell, J.P., & Zedeck, S. (1981). Measurement theory for the behavioral sciences. San Francisco, CA: W.H. Freeman & Company.

Hanson, M.A., Borman, W.C., Mogilka, H.J., Manning, C., & Hedge, J.W. (1999). Computerized assessment of skill for a highly technical job. In F. Drasgow & J.B. Olson-Buchanan (Eds.), Innovations in computerized assessment (pp. 197-220). Mahwah, NJ: Lawrence Erlbaum Associates.

Horgen, K., Lentz, E.M., Borman, W.C., Lowe, S.E., Starkey, P.A., & Crutchfield, J.M. (2012, April). Applications of simulation technology for a highly skilled job. Paper presented at the 27th Annual Conference of the Society for Industrial and Organizational Psychology, San Diego, CA.

Hunter, J. & Hunter, R. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72-98.

King, R.E., Manning, C.A., & Drechsler, G.K. (2007). Operational use of the Air Traffic Selection and Training Battery. (Report No. DOT/FAA/AM-07/14). Washington, DC: Federal Aviation Administration Office of Aerospace Medicine.

Murphy, K.R. (1989). Is the relationship between cognitive ability and job performance stable over time? Human Performance, 2, 183-200.

Norusis, M.J. (1990). SPSS advanced statistics user's guide. Chicago, IL: SPSS Inc.

Ramos, R.A., Heil, M.C., & Manning, C.A. (Eds.). (2001a). Documentation of validity for the AT-SAT computerized test battery, Volume I. (Report No. DOT/FAA/AM-01/5). Washington, DC: Federal Aviation Administration Office of Aviation Medicine.

Ramos, R.A., Heil, M.C., & Manning, C.A. (Eds.). (2001b). Documentation of validity for the AT-SAT computerized test battery, Volume II. (Report No. DOT/FAA/AM-01/6). Washington, DC: Federal Aviation Administration Office of Aviation Medicine.

Robbins, S.B., Lauver, K., Huy, L., Davis, D., Langley, R., & Carlstrom, A. (2004). Do psychosocial and study skill factors predict college outcomes? A meta-analysis. Psychological Bulletin, 130(2), 261-288.

Sackett, P.R., Putka, D.J., & McCloy, R.A. (2012). The concept of validity and the process of validation. In N. Schmitt (Ed.), The Oxford handbook of personnel assessment and selection (pp. 91-118). New York, NY: Oxford University Press.

Schmidt, F.L. & Hunter, J.E. (1998). The validity of selection methods in personnel psychology: Practical and theoretical implications for 85 years of research findings. Psychological Bulletin, 124, 262-274.

Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of employee selection procedures (4th Ed.). Bowling Green, OH: Author.

Van Iddekinge, C.H. & Ployhart, R.E. (2008). Developments in the criterion-related validation of selection procedures: A critical review and recommendations for practice. Personnel Psychology, 61, 871-925.

Waugh, G. (2001). Analysis of group differences and fairness. In R.A. Ramos, M.C. Heil, & C.A. Manning (Eds.), Documentation of validity for the AT-SAT computerized test battery, Volume II (pp. 43-47). (Report No. DOT/FAA/AM-01/6). Washington, DC: Federal Aviation Administration Office of Aviation Medicine.

Wise, L.L., Tsacoumis, S.T., Waugh, G.W., Putka, D.J., & Hom, I. (2001, December). Revision of the AT-SAT. (Report No. DTR-01-58). Alexandria, VA: Human Resources Research Organization (HumRRO).

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 257 of 342 Page ID
#:452

EXHIBIT 11

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 258 of 342 Page ID #:453

INFORUM | Fargo, ND - The Forum of Fargo-Moorhead

Anna Burleson, Forum News Service, Published March 05 2014

Want to be an air traffic controller? UND says


FAA has 'dumbed down the process'
GRAND FORKS - The Federal Aviation Administration has leveled the playing field for anyone wanting
to work as an air traffic controller.
But is that a good thing?
Instead of giving preferential treatment to people with degrees in the field, as of February, anyone can be
considered for the job as long as they pass a preliminary test and have a bachelor's degree or three years of
work experience in any field whatsoever.
Paul Drechsel, assistant chairman of the University of North Dakota's air traffic control program, said the decision was confusing and definitely a cause for concern.
"It's almost like they dumbed down the process," he said. "If I was the flying public I would be very
concerned about this."
FAA spokesman Tony Molinaro said the decision was made to "add diversity to the workforce."
"There's always a need for ATC because by age 55 you have to retire," he said. "That turnover does happen faster than in a normal workplace."
Molinaro said it's great if people with ATC degrees apply because the process will be a lot easier for them, but it doesn't necessarily mean they have a leg up.
"We know that we have to hire 'so many thousands' over the next decade, so it's a way to see if we can
find the best applicants across the whole population," he said.
But Drechsel doesn't see it that way.
"We're confused," he said. "We haven't had any feedback yet, but we requested it from the FAA so we can
make adjustments."
UND's Department of Aviation has assembled a legislative affairs committee to essentially convince state
and local politicians to use their influence to get the decision reversed.
"I think with patience, this will change," Drechsel said.
UND is certified as a Collegiate Training Initiative school, meaning before the rule change, students who


Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 259 of 342 Page ID
#:454

earned degrees could skip the first five weeks of a 12-week FAA-mandated training session at the Mike
Monroney Aeronautical Center in Oklahoma City.
Now, a person who has no experience in the field can take the course after passing an initial test to measure
things such as one's ability to handle stress.
But Molinaro said it still requires a certain skill set to pass all of the tests and work in ATC.
"We're looking at not just basic knowledge, we're looking at reaction time, working under stress,
multitasking, thinking in three dimensions, things like that," he said.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 260 of 342 Page ID
#:455

EXHIBIT 12

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 261 of 342 Page ID
#:456
Application Status

Application Status for announcement FAA-AMC-14-ALLSRCE-33537

Thank you for submitting your application for announcement FAA-AMC-14-ALLSRCE-33537. Based upon your responses to the Biographical Assessment, we have determined that you are NOT eligible for this position as a part of the current vacancy announcement.
The biographical assessment measures ATCS job applicant characteristics that have been shown empirically to predict success as an air traffic controller in the FAA. These characteristics include factors such as prior general and ATC-specific work experience, education and training, work habits, academic and other achievements, and life experiences among other factors. This biographical assessment was independently validated by outside experts.
Many candidates applied for this position and unfortunately we
have fewer job openings than there were candidates. We
encourage you to apply to future vacancy announcements. Thank
you again for your interest in the Federal Aviation Administration.

If you would like further information, please make your request in writing to [email protected].

Application Status Page viewed on: (Central Time).
Return to USAJOBS

Would you like to view your application?
View Application

Help us improve the application process...
Take a Survey

OMB Control: 2120-0699
Survey Information
4/22/2016

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 262 of 342 Page ID #:457

AVIATOR :: Application Status

Application Status
Application Status for Jorge A Rojas on
announcement FAA-ATO-15-ALLSRCE-40166
Thank you for submitting your application for announcement
FAA-ATO-15-ALLSRCE-40166. Based upon your responses to the
Biographical Assessment, we have determined that you are NOT
eligible for this position as a part of the current vacancy
announcement.
The biographical assessment measures ATCS job applicant
characteristics that have been shown empirically to predict
success as an air traffic controller in the FAA. These
characteristics include factors such as prior general and ATC-specific work experience, education and training, work habits,
academic and other achievements, and life experiences among
other factors. This biographical assessment was independently
validated by outside experts.

Your Application
Would you like to view your application?

Feedback
Help us improve the application process...

OMB Control: 2120-0699

Survey Information

Many candidates applied for this position and unfortunately we


have fewer job openings than there were candidates. We
encourage you to apply to future vacancy announcements. Thank
you again for your interest in the Federal Aviation Administration.
If you would like further information, please make your request in
writing to [email protected].

Application Status Page viewed on: April 22, 2016 5:23 PM (Central Time).

https://jobs.faa.gov/aviator/Modules/MyApplications/ApplyStatus.aspx?vid=40166

1/1

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 263 of 342 Page ID
#:458

EXHIBIT 13

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 264 of 342 Page ID
#:459

FOIA Program Management Branch


800 Independence Avenue SW
Washington, DC 20591

May 21, 2015


Mr. Jorge Rojas
21305 Brighton Ave
Torrance, CA 90501
Re: Freedom of Information Act (FOIA) Request 2015-006130
Dear Mr. Rojas:
This letter acknowledges receipt of your FOIA request dated May 20, 2015, concerning all records
concerning my application for the March 2015 Air Traffic Control Specialist hiring announcement
FAA-ATO-15-ALLSRCE-40166. All emails and other written communications between individuals in the
Office of the Administrator (AOA), Air Traffic Organization (ATO), and Human Resources (AHR) Lines of Business.
Your request has been assigned for action to the office(s) listed below:
Federal Aviation Administration
Air Traffic Organization (AJI-172)
800 Independence Avenue SW
Washington, DC 20591

Contact: Melanie Yohe


Regional FOIA Coordinator
(202) 267-1698

Federal Aviation Administration


FOIA Program Management Branch (AFN-140)
800 Independence Avenue SW
Washington, DC 20591

Contact: Susan Mclean


Regional FOIA Mgmnt Specialist
(202) 267-8574

Federal Aviation Administration


Office of the Chief Counsel, AGC-100
800 Independence Avenue, SW
Washington, DC 20591

Contact: Shantell Frazier


FOIA Coordinator
(202) 267-3824

Federal Aviation Administration


Office of Human Resources
1601 Lind Ave., SW
Renton, WA 98057

Contact: Beth Mathison


FOIA Coordinator
(425) 227-2070

Should you wish to inquire as to the status of your request, please contact the assigned FOIA coordinator(s).
Please refer to the above referenced number on all future correspondence regarding this request.
Sincerely,

Alan Billings

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 265 of 342 Page ID
#:460

EXHIBIT 14

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 266 of 342 Page ID
#:461
Jorge Rojas <[email protected]>

Rojas v. FAA (FOIA) - Case No. CV 15-5811 CBM (SS) - Status/26(f) Availability
Medrano, Alarice (USACAC) <[email protected]>
To: Jorge Rojas <[email protected]>

Mon, Dec 14, 2015 at 8:26 PM

My understanding is that the agency initially reviewed the validation study for the 2014 BA while you requested the validation study for the 2015 BA. The agency has now reviewed the underlying documentation for the 2015 BA and I believe that they again concluded that the documents are privileged under the attorney-client and attorney work product privileges. However, I have not been advised of the final response or whether there will be a partial release. I will look forward to speaking to you further tomorrow. Please let me know where you would like me to contact you. Of course, if I receive anything before our meeting, I will forward it via e-mail.

From: Jorge Rojas [mailto:[email protected]]


Sent: Monday, December 14, 2015 6:47 PM
[Quoted text hidden]
[Quoted text hidden]

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 267 of 342 Page ID
#:462

EXHIBIT 15

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 268 of 342 Page ID
#:463

Federal Aviation
Administration

Memorandum

Date: FEB 11 2016
To: [illegible]
From: Air Traffic Organization, AJO-0
Subject: AT-SAT Replacement Validation Study

The FAA is evaluating potential replacements for the AT-SAT, which has been used to hire Air Traffic Controllers for the past 14 years. The FAA has engaged APTMetrics, an external consulting firm, to assist the agency in this effort. We are asking randomly selected CPCs, like you, to voluntarily complete a pilot version of these assessments to help us evaluate their effectiveness as a future selection tool.

Your individual test data will be accessible only to APTMetrics and the third parties providing the software and proctoring of the assessments. APTMetrics will also require supervisors to complete performance ratings. Your individual performance ratings will only be available to APTMetrics and will be used for validation research purposes only. Neither the individual test results nor the individual performance ratings will be shared with the FAA or impact the CPCs participating in the process.

Participants will complete the assessments at a nearby testing center run by PSI, an independent third party. The assessment should take approximately 6 hours to complete, including a 15 minute break and a 45 minute lunch break, and will be completed during duty time.

Your Frontline Manager will provide you with several options for dates when you can schedule your participation in the assessment. PSI has assessment centers nationwide. You will be provided with the locations upon scheduling for your assessment. You must bring one form of photo ID with you to the PSI location to securely check into the assessment. Please do not bring additional personal belongings such as backpacks or cell phones. To reiterate, your participation will be during duty time.

Assessments begin January 2016 through March 31, 2016

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 269 of 342 Page ID
#:464
2

STEPS FOR REGISTERING

Follow this link: https://candidate.psiexams.com/account/create_account.jsp

1. Complete the form to create your account and click "Submit"
2. Click "Find My Records" (in the drop-down menus select "Federal Agencies" and then "FAA Exam Validation Study (APTMetrics)")
3. Enter your Participant ID from your invitation
4. Click "Schedule for a test" (you will be directed to search for your nearest testing site and choose a test date and time)

PLEASE:

Do NOT perform general internet searches in order to determine what PSI sites are near you or what hours those sites are open. The FAA has made special arrangements for the project, and you will only get correct information about what sites are available (and when) by following the instructions above.

Do NOT call the testing sites directly. If you need to contact PSI, please call their main customer support number: 800-733-9267

If you need additional assistance or have questions, please contact Suzanne Styc, Acting Director of Resource Enterprise, at 202-267-0556.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 270 of 342 Page ID
#:465

EXHIBIT 16

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 271 of 342 Page ID
#:466

HR Consulting Services in Atlanta, NY, Vero Beach, Chicago - APTMetrics

Evidence-Based Talent Management Solutions

APTMetrics is the only HR consulting firm that builds world-class, customized talent solutions. We are nationally recognized for our employment class-action litigation support services. This combination ensures that the unique talent management solutions and HR consulting services we deliver are inclusive, fair, valid and legally defensible.

Founded by Kathleen Kappy Lundquist, Ph.D., and John C. Scott, Ph.D., in 1995, our multi-functional human resource consulting staff of more than 50 industrial-organizational (I-O) psychologists, HR professionals, IT specialists and other experts devise practical solutions to help our clients connect, assess, select, develop and retain their top talent.
The Fortune 100 and other organizations around the world trust us to deliver unparalleled talent
management strategies and we can do so for you.
If you want to know how our HR consulting solutions and talent management tools work, get in touch with
us today.
Testimonials
"Our people are the single most important ingredient to the Marriott service strategy. APT's expertise in employment assessment and delivery technologies is helping us to ensure our service standards remain a competitive advantage as we expand around the globe."
Adam Malamut, Ph.D., Vice President, Human Capital Planning, Analytics and Development - Marriott International Inc.
"We look forward to many future shared successes with APT. We've tried winning and we've tried losing and we've discovered that we prefer winning and working with you all certainly improves the odds of that."

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 272 of 342 Page ID #:467
R. Lawrence Ashe Jr., Esq., Senior Counsel - Parker, Hudson, Rainer & Dobbs LLP

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 273 of 342 Page ID
#:468

Litigation Support | HR Consulting Solutions - APTMetrics

APTMetrics' I-O psychologists provide employment litigation support and frequently serve as expert witnesses assisting both defendants' and plaintiffs' counsel in class-action employment discrimination, harassment and wage and hour law suits.
Our litigation support services include: examining whether statistical evidence supports the filing of class-action employment-discrimination lawsuits; identifying relevant materials and information that need to be
evaluated to determine whether employment discrimination has taken place; reviewing relevant
documentation to determine whether a test or other employment procedure (e.g., performance appraisal
system) is valid and job-related according to legal and professional guidelines and standards; drafting
questions to be used by lawyers during depositions; conducting job analysis to determine if jobs meet
exemption criteria in wage-hour cases; memorializing findings and conclusions regarding validity evidence
in expert reports; and testifying in court about expert opinions and conclusions.

Criminal Background Check Evaluations


While criminal background checks (CBCs) can serve legitimate business purposes by giving organizations
the ability to safeguard customers and employees, avoid negligent hiring claims and protect company
property, they often also have adverse impact in the hiring process. It is critical that employers ensure that
their criminal background check policies satisfy important business objectives and can withstand legal
scrutiny.
Our criminal background check evaluation service will help you identify appropriate job-related criteria for excluding criminal offenders from hire. We have developed a solid methodology for validating CBC criteria that leverages our firm's expertise in developing validated and legally defensible selection procedures.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 274 of 342 Page ID #:469
Contact us today to learn how we can help you conduct criminal background checks without breaking the
law.
OFCCP Audit Support
We offer assistance to employers faced with OFCCP compliance evaluations. Our OFCCP audit support
services include: consulting with contractors and their legal counsel to assess risk, reviewing
documentation to determine whether a test or other employment procedure is valid and job-related,
researching adverse impact findings, conducting compensation analyses, establishing the validity of
employment practices, and assisting employment counsel in their negotiations with agency officials.
HR Process Audits
Our HR process audit services are designed to help our client organizations meet and sustain the goal of
providing consistent and fair treatment to their employees. We proactively assess areas where HR
processes can be improved to derive the most value from diverse talent. We use a multi-phase approach in
working with internal or external counsel and our clients' HR departments to make recommendations and
implement improvements to HR processes.
Job Analysis for Wage and Hour Issues
Under wage and hour law, the classification of employees as eligible for overtime (non-exempt) or ineligible
(exempt) is based on the type of tasks employees perform at work.
This focus on work performed makes job analysis the ideal tool for ensuring accurate and legally defensible
decisions regarding exemption status. Job analysis allows for the collection of structured verifiable data to
document work requirements and support exemption decisions. For employers that fail to conduct job
analysis on a proactive basis to make exemption decisions, we also conduct post hoc job analyses to
defend against legal challenges.
Consulting Solutions

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 275 of 342 Page ID
#:470

&,+ 
#&##%+ %$%+&#,+!&%*)&-!)&$')!*&3
4  66!&-+,*!$?'*&!2,!'&$(+1 '$'!+,+
4 -%&*+'-*'&+-$,&,+
4
&'*%,!'&, &'$'1+(!$!+,+

 ++*  ')+3


4
4
4
4

*'++!'&$!&,*!,1
.!&8+((*'
 &!$0(*,!+
-+,'%*+*.!

!-)*!+0,''#!)
4 *,!!+/'%&8'/&-+!&++
1
4 *,!!+/'%&8'/&+%$$
-+!&++1, 


Global
Strategies
for Talent
Management.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 276 of 342 Page ID
#:471

,))*&/')+!*
4
4
4
4
4
4
4
4
4

*++++%&,
%($'1 $,!'&
!,!,!'&-(('*,
!.*+!,1,*,13+-*%&,
'&$1+!+
'%(,&1'$!&
*'*%&&%&,
,!&'***+3)-!+!,!'&+
*&!2,!'&$-*.1+

Global
Strategies
for Talent
Management.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 277 of 342 Page ID
#:472

,)6*&#,+!&%*#+&)$

 
=
#+!&%
#+!&
&% 
&
% =
$'#&0#+!&%0*+$





+%
)* !'****$%+,!+
'

&
&
&
 
=

&%#0*!*0*+$
0
0

BBE? +)!*
BE?
*=
BBE?
E?6)"0*+$
6 
0

,)-0 +)!*
,
0 + **=
)%!1+!&%#,)-00*+$

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 278 of 342 Page ID
#:473

 944!*

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 279 of 342 Page ID
#:474

,)")&,%.!+ *+!%
% !+!+!&%
A 0(*,!,&+++,!%'&1
A '-*,(('!&,0(*,!&+,,$%&,
4
4
4
4
4

A
A
A
A

'*
**'%! 3!,
'*&,&$1
'0'
 '8'$'%(&1

&.!,,+,!%'&1'&,+,!&'*
',!,!'&+/!, 
*',!. *'++-!,+
.$'(%&,&.$!,!'&'&/+$,!'&+1+,%+

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 280 of 342 Page ID
#:475

&##!%,*+!&%>@
A *1'-'&*&'-,$$
 $$&,'1'-*,+,+'*!&,*.!/+:
4 +
4 '
4
'&<,#&'/

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 281 of 342 Page ID
#:476

!%!+!&%
#   ,'
    
        
 
 !$) . -.- 

6 )3( -/, 5/- --$-!*,)3


(+'*3( ). $-$*)7

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 282 of 342 Page ID
#:477

 +!**+7
 !$# '!""

# !#

 #%#("#"

"(""""#


 &"#"

 !"#( #!"#
%#!"

$#'!
 $!#"

!"

!!%$#"

#!%&"

 "#( #!#("#"
 """"##!"
 ("#" "#"

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 283 of 342 Page ID
#:478

&##!%,*+!&%>A
A
+1'-*'%(&1-**&,$1-+!&/*!,,&
,+,+:
4
4
4
4
4

'4&',,$$
'4-,'&+!*!&-+!&!&, -,-*
+4'*+%$$&-%*'"'+
+4'*/!*&'"'+

 .&'!9

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 284 of 342 Page ID
#:479

'$$.35
).# 1





Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 285 of 342 Page ID
#:480

 0)*+* ##%7
A .*+!%(,
A $!!,1
4 '&
4
&)-,

A ++.*+
$,*&,!.+
A
&'&+!+,&,
%!&!+,*,!'&

4 ,
4 $!&'&,-!+
-,+!, '%(&1





Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 286 of 342 Page ID
#:481

-)* $'+!%!+!&%
A !+(*'('*,!'&,$1/*(*',,*'-((($!&,+(++, ,+,, &
%"'*!,1*'-((($!&,+
A +-$$1,*%!&15
4 I?J, +'*MEO*-$
4 ,&*.!,!'&,+,
A  (*+&'.*+!%(,%1!.&'*%&1,1(+',+,+

"!=. -.-1$.#0 ,- $(+.)


-/ --!/''3 ! ) 

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 287 of 342 Page ID
#:482

C<D+ *,# ##,*+)+!&%


A H'-,'J*!&8
%*!&+(++, ,+,
A KEO(++*,

A I'-,'J !,+(++, 
,+,
A MEO(++*,


  
($% #&

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 288 of 342 Page ID
#:483

)%*!%*+!%* .
A &)*'!!-!%& A )+)$' *!*&%:-#!!+0;
-)*!$'+)(,!)
&'**!%*&)*
4 1'&MEO*-$
4 *,!$&,,!+,!$
!&!!&
4 %($+!2!++-+

A !*')++)+$%+
+*+!%**
4
&'&+!+,&,%!&!+,*,!'&
4 $,!.-+',+,
4 -*1,*!$&(-&!,!.
%+

4 ,!'&$
4 ,,(,$(*'*%&4
&',(($!&,$'/

A #!%+!*9,)%+&!%+!0
#**-)*#+)%+!-*
4 -+,#&'/$
4 -+,%'&+,*,/'-$
$++.*+
4 -+,%'&+,*,+-+,&,!$$1
, +%.$!!,1

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 289 of 342 Page ID
#:484

$'#&0)*%+&,**,#
 %5
A '&$1++*'&-,
A *'-*+ ..$!!,1

A  1'-%&,+* '*$++.*+
$,*&,!.+

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 290 of 342 Page ID
#:485

#!%+!*%+&,**,# %5
A $!(*'-*+*%!&!+,*!&'&+!+,&,$1
A -,+'*+*+,,'' !

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 291 of 342 Page ID
#:486

 +!*+ : # !;&#!+!&%


+,07
GIN
&-+,*!$&*&!2,!'&$+1 '$'!+,+
&+/*, !+&*$,)-+,!'&+!&*&,+-*.1
B -$1GEEMC
4 !.&'%(,&,$1 '&-,*!,*!'&8*$, '*'&,&,.$!,!'&
+,-14 '/'$/'-$, .$!,!'&+,-1&,','&++!,,
&/.$!,!'&+,-1:
4 .*PJ1*+'$B$-'$$*4-(*.!+'*14&*!$40-,!.
+,+C
4  &!$+,+"-,' .+ '*,*+ $$!PH,'H6J1*+'$
4 %!&!+,*,!.?-+,'%**.!?$+PI,'I6J1*+'$





Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 292 of 342 Page ID
#:487

+ ),)-0*,#+*
A ++-%!&"'<+,+#+'*/'*# .!'*+'&', &4
'* &.*1$!,,$4/ ,!+, + $$!'"'&$1+!+: ,
!+4,* '/%- ,!%/'-$'&&,'-(,, "'
&$1+!+4.&/ &, **$!,,$'*&' &+,', %"'*
,+#+'*/'*# .!'*+', "':
4 .*PJ,'K1*+'$

A  ,!+, + $$!', -,+'*: ,!+4 '/$'&


/'-$1'-*'%%&-+!&-,+'*'*'&-,!&
!,!'&$*+* ,',*%!&!, -,+'*&+
,' &:
4 .*PH,'H6J1*+'$





Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 293 of 342 Page ID
#:488

+ ),)-0*,#+*
A '&!,!'&+, ,+ '*,&, + $$!'.$!,!'&
+,-15
4  &!&, &,-*', "'-,!+&'* +
4 '&!,!'&+, ,*+-$,!&, %*&'.*+
!%(,'*$$ $$&+
4  &+!&(($!&,('(-$,!'&
4  &+!&%!&!+,*,!'&%'
4 *&!2,!'&$ &+B664%**4'/&+!2!&C





Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 294 of 342 Page ID
#:489

%+ ,)-00*5
A  +*(*'++!'&$$1*!.=*-$+
', -%>
A +, +!&!&+&1'-*(*'++!'&$"-%&,
A #!,!'&$(*'++!'&$"-%&,
A #$$!&(-,

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 295 of 342 Page ID
#:490

#.(/-.
3*/*.*0'$.
. -.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 296 of 342 Page ID
#:491

*+#!!+0 &&"* !"5

2%(5)250$1&(

(23/(:+26&25(+,*+
217+(7(67$5($/62+,*+
3(5)250(56217+(-2%

(23/(:+26&25(/2:
217+(7(67$5($/62/2:
3(5)250(56217+(-2%

(67&25(6

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 297 of 342 Page ID
#:492

 #!!+0&#+!&%)&,)
%)%*
A $,!'&(*'-*+(*'.!+%($+
' .!'*/ ! $$'/-+,'%#
!&*&+'-,5
4
4
4
4
4

 ,+!!$!,!+(*+'&('+++++
 ,, (*+'&#&'/+
 ,, (*+'&&'
 ,(*+'&!+/!$$!&,''
'/(*+'&/!$$ .!&, -,-*

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 298 of 342 Page ID
#:493

#!+!&%!%
A $!!,1**+,', *,'/ ! ,+,+'*+
*"'*$,
A  (*'++'.$!,!'&!&.'$.+-%-$,!&
.!&,'(*'.!+'-&+!&,!!+!+'*, 
(*'('+-+'
, ,+,

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 299 of 342 Page ID
#:494

&,)*&#!!+0-!%
A .!&+('&+,'&,&,B'&,&,$!!,1C
4 %'&+,*,!'&, ,, '&,&,',+,!+*(*+&,,!.'!%('*,&,
+(,+'(*'*%&'&, "'

A .!&+'&$,!'&+,', **!$+B*!,*!'&
$!!,1C
4 ,,!+,!$%'&+,*,!'&'*$,!'&+ !(,/&+'*+'&,+,
&"'(*'*%&'+%($'%($'1+

A .!&+'&
&,*&$,*-,-*
B'&+,*-,$!!,1C
4 %'&+,*,!'&, ,,+,%+-*+'&+,*-,B+'%, !&$!.,'
&-&*$1!& -%&,*!,'* *,*!+,!4+- +'&+!&,!'-+&++C
&, '&+,*-,!+!%('*,&,'*+-++-$"'(*'*%&

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 300 of 342 Page ID
#:495

&%+%+#!!+0+,0


2%1$/<6,6

(67(9(/230(17

,*!$' %*3$ ).$!3$)" -- ).$'!/).$*)-


 -&)*1' " -&$''-$'$.$ -)
+ ,!*,() -.), 0 '*+, +, - )..$0 -(+' -*!
+ ,!*,() *($)- %*-(+' . -.-
%*-&$''. -.-*,%*&)*1' " . -.-


!$/,'$7,21

- -/% .(.. , 2+ ,.%/"( )..*


*/( )., '.$*)-#$+- .1 ). -.
*). ).)+ ,!*,() *($)-


(7$66,1*&25(6

- $)/( ).-)-/% .(.. , 2+ ,.-


.* -.'$-#+--$)"-*, -

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 301 of 342 Page ID
#:496

0 **,*!%&%+%+#!!+0
A '%(* &+!."'&$1+!+
A '%(,&!&,+,'&+,*-,!'&
A +,'&,&,*$,,'"'<+'&,&,
A +,'&,&,*(*+&,,!.'"'<+'&,&,
A 0%!&,!'&'$++.*+$,*&,!.+
A (++!&+'*, ,+$,+, '+/ '&
,,*(*'*%, "'

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 302 of 342 Page ID
#:497

)!+)!&%6#+#!!+0+,0
2%1$/<6,6

(9(/2325&48,5((676

(9(/23
(5)250$1&(($685(6

5<87,/27

5<87,/27

2//(&7(67$7$
33/,&$1762503/2<((6

2//(&7(5)250$1&($7$

(/$7((67&25(6
(5)250$1&(($685(6
67$%/,6+'0,1,675$7,9( 6(

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 303 of 342 Page ID
#:498

0 **,*!%)!+)!&%6#+#!!+0
A )-1'"'(*'*%&*!,*!
A  (+1 '%,*!)-$!,1',+,&
*!,*!'&%+-*
A *''**$,!'&&++*1,'
+,$!+ .$!!,1
A 0%!&,!'&'$++.*+$,*&,!.+
A ((*'(*!,&++', (++!&+'*

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 304 of 342 Page ID
#:499

 %%&,#!+7
A '.*+!%(,
A *&+('*,!&.$!!,1*'%&', *"'
'*$',!'&
A &*$!2!&.$!!,1*'%', *+,-!+
'+!%!$*"'+

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 305 of 342 Page ID
#:500

*'&%*!!#!+0&)#!+!&%
A $!,!'&!+, "'!&,*+('&+!!$!,1',+,
.$'(*&,+,-+*
A  &, -+',+,!*+*'%, ,+-(('*,
1, ,+,.$'(*4, ,+,-+**++(!$
*+('&+!!$!,1'*.$!,!'&

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 306 of 342 Page ID
#:501

 +%&,*+&)7
A '&$1+!+/!$$!&,!1
/ ,,'++++
A
,!+&',&++*1,'
%+-*.*1!%('*,&,
9
A
,!+&++*1, ,.*1
%+-*
!%('*,&,9

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 307 of 342 Page ID
#:502

&%#0*!*!*+ &,%+!&%

"
"
"


2
%

1
$
/
<
6
,
6

"25.&7,9,7,(6(5)250('

! ' $


$$$
'&"!%

&23($1'))(&72)"25.

%&#&"!%

(&+1,&$/.,//6(48,5('

!"$ &!$%
&!$
"$$" "&"!%

203(7(1&,(6(48,5('
'8&$7,21(48,5(0(176
;3(5,(1&((('('

&$'&'$!&$()%

$"$ !
!
&!$%

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 308 of 342 Page ID
#:503

&##!%,*+!&%>B
A '+1'-*'%(&1-+'*%$"'&$1++
+, +!+'*+$,!'&(*'-*+:
4 '
4 +

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 309 of 342 Page ID
#:504

/$'#&*+'!!+!&% +)!/

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 310 of 342 Page ID
#:505

0'*&*+*+&&%*!)

#!
4 *")$.$0  $'$.3 -.$)"
4 *)*")$.$0   -/, 4 ). ,0$ 1


4 )*1' "   -.$)"
4  ,!*,()  -- --( ).
4 ). ,0$ 1

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 311 of 342 Page ID
#:506


#!!+0

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 312 of 342 Page ID
#:507

#+!%*+
A = ! !$!,1>%+-*%&,,''$+B/'*#+%($+4
.!'4++++%&,&,*+C*%'*(,$
,'&!,+
A &,+,+(!!,!'&+ .&.$'(4!5
4 -+,'%+!&'*!&,!1'%%*!$$1
.!$$,+,:
4  ,!+((*'(*!,,+,!&%!-%:
4 ! *!$!,1P'*'+,$1

A  , *-+,'%'*'%%*!$$1.!$$4%-+,
.$!,'*1'-*"'+

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 313 of 342 Page ID
#:508

&$!%****$%+*

"
$"$ !

"!&(

&$'&'$
!&$()
"!
"!&(

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 314 of 342 Page ID
#:509

)!+)!&)*+#!* !%'')&')!+
,+6&&)*
A -,8'+'*++ '-$5
4 '&+!+,&,/!, &'*%$0(,,!'&+'(*'!!&1
/!, !&, /'*#'*
4 *%!,, +$,!'&')-$!!(($!&,+
4 $$'/&'*&!2,!'&,'%,!*%,!.,!'&'$+
4 .'-%&,*,!'&$

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 315 of 342 Page ID
#:510

 + "**+*!)%%*!#7

#!!+0

$'#$%++!&%

 +'&"'&$1+!+

 *!&!&

 ,&*!2

 &'!&%'&!,'*!&

 '&+!+,&,

 (($+(*'++

 $!,

 '%%-&!,!'&

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 316 of 342 Page ID
#:511

%&!% &%!+&)!%
4 '9(56(03$&7
4 (67217(17
4 '0,1,675$7,21668(6

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 317 of 342 Page ID
#:512

&'!-,*+!&%*&,+*+!%

F6

=*&<,,+,+ **,'&, &', *


+$,!'&(*'-*+:>

G6

=
    ; ,+, +&'.*+!%(,6<

 .&'(*'$%+4*! ,:>

H6

=
    ; ,+,!+.$!,@ *-+,-+9<

 .&'(*'$%+4*! ,:>

I6

=&!,+&,#, !+&/,+,, '%'.*


, 
&,*&,7+'/&*-'.* 6

 .&'(*'$%+4*! ,:>

J6

=&, ,+,!+.$!,4/<*'&4*! ,:>

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 318 of 342 Page ID
#:513

*+#!!+0,)-0
'*'(1', *+-$,+'<+*&,+,$!!,1
 $ !-*.1%!$1'-**)-+,,'5

&'D,*!+6'%

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 319 of 342 Page ID
#:514

&%++ %&)$+!&%
 2 %4
& '*&$!*$
'&$''*
*!&4EKMGE
GEH6KJJ6LLLN
$&,'$-,!'&+D  6'%
///6  6'%

Global
Strategies
for Talent
Management.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 320 of 342 Page ID
#:515

EXHIBIT 17

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 321 of 342 Page ID
#:516
U.S. Department
of Transportation

Assistant Administrator for


Human Resource Management

800 Independence Avenue, SW


Washington, D.C. 20591

Federal Aviation
Administration

October 8, 2014

Mr. Kyle Nagle

Dear Mr. Nagle:


Thank you for your March 1, 2014 letter, submitted to Vice President Biden, about the redesigned hiring process for the Federal Aviation Administration's (FAA) Air Traffic Control Specialist (ATCS) position.

As you are aware, in February the FAA announced the establishment of the Interim Hiring Process, by which interested individuals applied for, were evaluated, and hired for training for ATCS positions. In response, the Agency received approximately 28,000 applications for approximately 1,700 ATCS positions.

I want to assure you that the FAA's goal in implementing the Interim Hiring Process was to ensure the Agency selects applicants with the highest probability of successfully completing our rigorous air traffic controller training program and achieving final certification as an ATCS. The Agency's training program is the primary method for ensuring we employ highly-trained controllers committed to maintaining the highest safety standards in support of the National Airspace System.
A group of unsuccessful applicants for ATCS positions have initiated a class action lawsuit challenging, among other things, the FAA's use of the Biographical Assessment. Given the pending litigation and our ongoing evaluation of the merits of likely claims, I trust you understand our sensitivity at this time about releasing detailed information about the Interim Hiring Process. Nevertheless, in an attempt to be responsive, we are providing you as complete information as possible in response to inquiries from you and other Members of Congress.

The Interim Process differed from prior Agency practice for hiring ATCS primarily in two ways. First, we created a single, nation-wide vacancy announcement and a single process to evaluate and assess applicants. In the past, separate vacancy announcements and distinct processes were used to evaluate applicants based on whether the applicant met the specified eligibility requirements for the vacancy announcement. Second, an applicant had to achieve a passing score on a new component of the hiring process - the Biographical Assessment. Upon passing the Biographical Assessment, applicants were eligible to take the Air Traffic

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 322 of 342 Page ID
2
#:517
Selection and Training (AT-SAT) exam, on which they also had to achieve a passing score.
Below is a more detailed description of the Interim Hiring Process, its purpose and the
process used in its development.

Summary of the New Interim Hiring Process


The Interim Hiring Process established a five-tiered applicant assessment process; applicants
are required to successfully complete each step before being eligible for the next step of the
assessment process:

Biographical Assessment - The Biographical Assessment measures an applicant's education, academic achievement, aviation-related experience, and prior air traffic control-related experience and achievement orientation. It was professionally developed and validated based upon years of extensive research of the ATCS occupation in accordance with relevant professional standards and legal guidelines for pre-employment selection testing;

Baseline Employment Eligibility Screen - Screens for citizenship, Selective Service registration, as well as minimum experience/education position-specific qualifications required by the FAA and the Office of Personnel Management's Qualification Standards for the Air Traffic Control Job Series, 2152;

AT-SAT Exam - Measures cognitive abilities and personal characteristics shown empirically to predict success as an air traffic controller, including mathematical ability, decision making, spatial information comprehension, working memory, sustained attention, object projection, perceptual speed and accuracy, and planning, among others;

Air Traffic Organization Conditional Offer Letters - Successful applicants receive a conditional offer to enter the FAA's ATCS training program; and

Mandatory Pre-Employment Clearance - Consists of a medical evaluation (drug test,


physical and psychological examination) and a background investigation to determine
suitability for employment and eligibility for a security clearance.

Those who successfully complete the above five-stage assessment then are employed as ATCS trainees. Trainees are required to pass a rigorous training program at the FAA Academy located at the Mike Monroney Aeronautical Center in Oklahoma City, Oklahoma. Successful completion of academy training within uniformly applicable time limits is followed by assignment to an air traffic facility, where the ATCS trainee serves on-site in a developmental training status until they achieve Certified Professional Controller (CPC) status.

Prescreening applicants on the Biographical Assessment prior to allowing them to take the AT-SAT resulted in considerable financial savings (over $7 million), shortened the hiring cycle and helped the Agency meet its goal of hiring the applicants most likely to succeed as an ATCS.

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 323 of 342 Page ID
3
#:518

Process Used to Develop the New Interim Hiring Process


Pursuant to the requirements of the Equal Employment Opportunity Commission's (EEOC) Management Directive 715, the FAA conducted a Barrier Analysis of the FAA's Air Traffic Control Specialist centralized hiring process. Management Directive 715 requires agencies to regularly evaluate their employment practices to identify barriers to equality of opportunity for all individuals. Where such barriers are identified, Management Directive 715 further requires agencies to take measures to eliminate them.

The Agency's Barrier Analysis Report was prepared with the expert assistance of Outtz and Associates and APT Metrics, Inc., two independent consultancies with nationally recognized expertise in the design and validation of pre-hire employee selection tests. The analysis examined a variety of qualitative data such as stakeholder interviews and site visits, as well as quantitative data such as AT-SAT testing data and other data contained in the Agency's Automated Vacancy Information Access Tool for Online Referral (AVIATOR). Additionally, the Barrier Analysis reviewed the recommendations of the separate Independent Review Panel.
Regarding your concern that diversity was a factor in the selection process, let me make it clear that neither race, national origin, gender, nor any other prohibited factor played any role in the interim selection process. I can assure you that diversity was not a factor in determining applicant eligibility or who was referred and/or selected for employment with the FAA. While information on race, national origin, and gender was collected from applicants on a voluntary basis by the USAJOBS application system, applicants were not required to submit race, national origin, or gender information in the application process. The information collected by USAJOBS on applicants' race, national origin, or gender was not accessible by the selecting officials making selection decisions. The race, national origin and gender information submitted or omitted by the applicants had no impact on applicant eligibility.

"Scoring" of the Biographical Assessment


The Biographical Assessment was scored using an automated process based on predefined
question weighting. Unlike skills tests that have questions with unique correct or incorrect
answers, questions on the Biographical Assessment were pre-assigned weight according to
how well they predict a candidate successfully reaching full certification at a facility.
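As a generic illustration of this kind of scoring (not the FAA's actual items, weights, or cut score, which the letter declines to disclose), a weighted biographical composite can be computed as a dot product of item responses and predictive weights, with eligibility determined by a cut score; a minimal sketch with hypothetical values:

    import numpy as np

    # Hypothetical item weights, responses, and cut score for illustration only.
    weights = np.array([0.8, 1.2, 0.5, 2.0])          # per-item predictive weights
    responses = np.array([3, 1, 4, 2], dtype=float)   # one applicant's answers
    cut_score = 9.0

    composite = float(weights @ responses)            # weighted composite score
    eligible = composite >= cut_score                 # pass/fail eligibility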
We have received questions about the scoring of the Biographical Assessment along with
requests for individual scores and the score needed to pass the assessment. Disclosure of the
Biographical Assessment items and the basis for scoring and weighting given to each question
would diminish the validity and utility of the instrument for the selection of persons into the
ATCS occupation. The release of this information would materially and negatively impact
the Agency's interest in the selection of persons most likely to succeed in the occupation and
undercut years of research that have been conducted on these items. Disclosure of the basis
for scoring and weighting would enable future test takers to artificially inflate their scores on

Case 2:15-cv-05811-CBM-SS Document 27-1 Filed 04/25/16 Page 324 of 342 Page ID
4
#:519
the instrument thereby giving them an unfair advantage in competing for a job under merit
principles.
While we are unable to share the specifics of the question weighting or individual scores, we can share that the minimal passing score for the Biographical Assessment was based on the professionally developed test-validation study and is set to predict that 84 percent of the applicants who passed the Biographical Assessment would be expected to successfully complete the FAA Academy and achieve CPC status.
Treatment of CTI Graduates Under Prior and the Interim Hiring Processes
The FAA created the AT-CTI program to establish partnerships with post-secondary
educational institutions to encourage interest in employment opportunities in the aviation
industry as a whole. The AT-CTI program was not designed or intended to serve only the
FAA to the exclusion of the employment opportunities in the aviation industry nor was the
program designed or intended to be the FAA's only source of applicants for ATCS positions.
The FAA has always used the AT-CTI program in conjunction with other recruitment sources when hiring ATCS. Because we implemented the Biographical Assessment as an initial screening process for the 28,000 applicants for the ATCS position, not all AT-CTI students that were eligible under prior vacancy announcements were found eligible under the February vacancy announcement. It should be noted, however, that 65 percent (1,034 of the 1,591) of individuals who received a tentative offer of employment had some combination of AT-CTI schooling, veterans' preference, or some specific aviation-related work history and experience. In addition, under the Interim Hiring Process, AT-CTI students and graduates received conditional offers of employment at three times the rate of non-AT-CTI students and graduates.
While your letter did not request demographic information about non-AT-CTI program students and graduates, you may also be interested to know that of the approximately 1,591 applicants who received tentative offer letters during the interim hiring process, approximately 904 disclosed their demographic data (race, national origin, and gender). Of the 904, approximately: 650 (71 percent) were male and 260 (29 percent) were female; 544 (60 percent) were White; 153 (17 percent) were Hispanic or Latino; 92 (10 percent) were Black or African American; 57 (6 percent) were Asian; 48 (5 percent) were Multi-ethnic; 6 (1 percent) were Native Hawaiian/Pacific Islander; 4 (.4 percent) were American Indian. Please note that demographic data was not accessed or used during the selection process. Indeed, information about a test-taker's demographic identity was not available to FAA decision makers involved in the applicant assessment process under the Interim Hiring Process.
FAA's Continued Relationship with AT-CTI programs
AT-CTI programs are an essential component of the FAA's multi-faceted program to ensure a predictable supply of highly skilled air traffic controllers in the years to come. The programs are important to the FAA and to the aviation industry. The FAA will continue to work with AT-CTI schools to encourage interest in employment opportunities in the aviation industry generally and with the FAA, specifically. AT-CTI students and graduates are encouraged to apply to FAA vacancy announcements for which they feel they are qualified.
In sum, we will continue to monitor our recruitment and assessment strategies to ensure we
hire the best qualified individuals into the ATCS profession. Our commitment to aviation
safety remains our top priority, and these changes to our hiring processes serve to enhance the
effort.
If I can be of further assistance, please contact me or Roderick D. Hall, Assistant
Administrator for Government and Industry Affairs, at (202) 267-3277.
Sincerely,

[signature]
Human Resource Management

Enclosure
Transmitted Correspondence
cc: White House Office of
Presidential Correspondence


03/01/2014

Vice President Joe Biden


Old Executive Office Building
Washington DC 20500
Re: Hiring of air traffic controllers, cancellation of CTI/AT-SAT consideration
Dear Mr. Vice President,
My name is Kyle Nagle. I am writing today concerning sudden, costly, and unethical changes in the FAA's hiring plan for new air traffic controllers, with specific regard to the abrupt disregard for CTI students. At the Community College of Beaver County, I have spent the past 3 years, and thousands of dollars, in the FAA's Air Traffic-Collegiate Training Initiative (AT-CTI) program. The AT-CTI program has, for a quarter of a century, been the primary means for the preparation of aspiring controllers. Solid proof has been given on numerous occasions, both through FAA reports and third-party studies, that the AT-CTI program produces qualified, capable controllers who go on to experience great success in careers as controllers with the FAA. Nevertheless, on December 30th, 2013, the FAA announced, without warning, that they were abandoning this method of hiring almost entirely, by announcing a new off-the-street hiring program. The reasons behind this change are suspicious, and the change itself is contrary to a great wealth of information which suggests that it cannot be of benefit to the government, the applicant, or the nation.
Let me take the time to further inform you of my own personal journey. I'm 29 years old, originally from eastern Pennsylvania, and have been on a long educational journey which has led me to where I am now. My aviation education started in 2005 when I decided to no longer study music at Slippery Rock University of Pennsylvania. I began flight training and, through years of balancing studying, part-time work, flying, and multiple flight schools (due to closures), I was able to obtain my multi-engine commercial pilot certificate with an instrument rating. My curiosity in the Air Traffic side of aviation was piqued after repeatedly hearing from multiple aviation professionals about the best school in the nation for air traffic control. I researched how to get hired as an FAA air traffic controller and found the FAA CTI program listed on the FAA's website and found the very school listed on the site I had heard so much about. That school was the Community College of Beaver County and it happened to be in my great state of Pennsylvania. In 2011 I was able to obtain a part-time transfer with the company I have worked for over the past 6 ½ years (at&t mobility) and enroll at CCBC. Working for at&t has allowed me to at least break even and pay $800 on my $80k plus student loan debt. At CCBC I was able to apply my previous aviation experiences and in August of 2013 I graduated with a 3.9 GPA. I was 26 at the time and knew that I had to make sure I do my best because I would be 28 by graduation time. That means not only must I complete the CCBC program on time, I also
only have one shot at passing the AT-SAT (a test which could not be retaken until after a year and, if passed, your score is good for 3 years). This is significant due to the fact that one must be hired by the FAA to be an air traffic controller before their 31st birthday. After graduation I decided to stay living in western PA in hopes that I had performed well enough in the CCBC program to actually be hired as an air traffic controller in the air traffic control tower CCBC operates to train its students. Fortunately, after 6 months of waiting, that day came in May of 2013. I have been able to work as an air traffic controller (although a non-FAA air traffic controller) and train students while waiting to be hired by the FAA. Recently, I was also hired to teach one of the final ATC classes students will take at CCBC. Since I graduated there have been two FAA CTI hiring announcements. The first was in August of 2013, for which my application was not considered. The FAA had decided to only consider those applicants who had previously applied to hiring panels before this August 2013 panel, automatically disqualifying anyone who is applying for the first time. The second CTI announcement came sometime in the first quarter of 2013. The FAA used the applications of those who applied to the August 2013 panel (myself included) for this announcement. However, due to sequestration/budget issues, again my group was not considered and that panel was completely scrapped. That brings us to today. The FAA will now be coming out with a public hiring announcement (anyone can apply). Furthermore, previous CTI students like myself must re-take the AT-SAT. The clock is ticking for me in terms of age as well as so many others. I believe my hard work, determination, and skill set should be considered in the hiring process. However, I do not believe this new process will take into account these facts as well as the possibility of me aging out.
In brief, the new off-the-street hiring will not consider whether a person is a graduate of a CTI school, and will not consider the applicant's score on an aptitude test (the AT-SAT) which was specifically designed to determine, and has been shown to be an excellent predictor of, the suitability of applicants. Rather, a "biographical questionnaire" is to be introduced. These changes are pursuant to a Barrier Analysis which was conducted in recent years, itself an odd notion. If you refer to page 44 of the FAA's A Plan for the Future: 10-Year Strategy for the Air Traffic Control Workforce 2013-2022, you will find that the FAA's goal is to maintain a pool of 2000-3000 applicants at any given time. At the end of FY2012, that pool contained more than 5000 persons, many of them CTI graduates. What, then, motivated the Barrier Analysis which prompted the new hiring protocols?

The Barrier Analysis was the latest attempt of many, over the years, to understand why the air traffic workforce is less diverse than is ideal. In so attempting, the conclusion appears to have been reached that the FAA should seek out new applicants explicitly on the basis of race, et alia. To quote from page 152 of the Barrier Analysis itself, "[Race and National Origin] and gender diversity should be explicitly considered when determining the sources for the applicants ..."


This recommendation resulted from the analysis' finding that 4 of 7 hiring phases resulted in adverse action against minorities. But adverse action is not a term to be used lightly. It is, specifically, any action taken in the employment process which results in discriminatory hiring practices. One of several mistakes made by Drs. Outtz and Hanges in the Analysis (a report which concludes, on page 155, by stating that the Analysis "was rendered unacceptable" due to extreme time limitations) was to confuse correlation for causation. Yes, there are problems with diversity; no, the FAA's hiring process is not the cause of them. As the Analysis itself shows, as page 16 of the FAA Independent Review Panel on the Selection, Assignment and Training of Air Traffic Control Specialists clearly expresses (an air traffic control trainer urging, "Please do not send me any more public hires"), and as can be found in any of the investigations into the validity of the AT-SAT battery of aptitude tests, the existing hiring process of utilizing CTI schools coupled with AT-SAT testing produces highly successful and qualified candidates who invariably outperform off-the-street hires and even persons with veterans' referrals. It is quantifiably, unmistakably, outstandingly clear that the CTI program is successful, that the AT-SAT is an outstanding predictor of excellence, and that there are thousands of qualified candidates ready and waiting to be hired from this combination.
And yet, the FAA has chosen, for all intents and purposes, to abandon all of this.
Sir, it is a noble goal to ensure diversity in the workplace. The new hiring program appears to be a last-ditch effort at achieving this diversity against all odds. I do not know why diversity is problematic, but I do know three things:

Firstly, the AT-SAT exams and CTI schools are not the causes of problems in diversity. Attempt after attempt to modify hiring processes and reweight test scores has failed because, in the final product, when it comes to only hiring those who, when all is said and done, are most capable, the adverse impact still remains. Something cultural, something fundamental, is the cause of these problems, not FAA hiring policy.

Secondly, I have spent tens of thousands of dollars going to school because it was made very clear that that was the preferred, and at times only, method of becoming an air traffic controller. The 36 CTI schools have invested millions of dollars in designing curricula, hiring instructors (often former controllers), installing simulators and equipment, and coordinating internships with ATC facilities. Now, all of that investment appears to have been for naught.

Thirdly, "explicitly" considering applicants on the basis of race, national origin, or gender, and especially when doing so instead of on the basis of their relevant educational

background or aptitude test scores, is not only discriminatory but potentially dangerous, insomuch as it aims to diversify a workforce by looking at non-relevant traits before, and instead of, those which have been shown, over and over, to matter significantly.

I understand and appreciate your efforts to bring opportunities for all. I feel like I have done everything I can and should have done to pursue my dream. My results of this new hiring process just came in a few days ago. Regardless of my 10 years of responsible progressive work experience, 140+ college credits and FAA CTI Associate's Degree, FAA Control Tower Operator certificate, FAA multi-engine, commercial pilot, and instrument ratings, and other qualifications that were asked on the BQ, I was denied. At this point I feel helpless and could use some support. Please consider reviewing this matter in preserving fairness for all. I've always viewed your Presidency as finally having someone who can understand the issues that average Americans face. From reading your books and hearing about your life I feel comfortable sending this letter to you and knowing the very least your staff would see it. If it crosses your desk, even better.

Sincerely,

Kyle Nagle


Please look into these matters, please encourage the FAA to provide preferential treatment for CTI graduates as they have in the past, and please consider the following points in so doing:

Off-the-street hiring has been shown, repeatedly and concretely, to be less effective than hiring CTI graduates.

Off-the-street hiring is more expensive for the FAA, as it must train new hires "from scratch", including costs sunk in those who fail the training program.

Without requiring a college degree, as the CTI program now does, the new hiring scheme lowers standards in general. A person who failed out of college in the CTI program is now eligible, provided they have three years of work experience.

The FAA's website has, for years, made very clear that the only paths into air traffic control are prior experience or the CTI program. This clearly implies some significance to the CTI program and has, thus, been enormously misleading for the thousands of students who have invested in that program, most of them borrowing huge sums of money from the federal government to finance their educations (in other words, the result of this new hiring scheme is direct, quantifiable, and substantial harm to thousands of young Americans).

The FAA misled the 36 CTI schools, who now find that substantial portions of their educational frameworks serve no purpose. The cancellation of a program is not, in itself, offensive. Doing so without advance notice, and for reasons not only dubious but, indeed, proven to be ineffective, is strikingly unethical and distasteful.

The Barrier Analysis contains numerous mathematical and typographical errors, likely due to the aforementioned acknowledgement by the analysis' authors that it was rushed and that such hurrying compromised its usefulness.

The Barrier Analysis, in finding adverse impact, appears to omit several enormously important considerations. It refers to and appears only to consider 4-year degrees or 4-year schools, yet 15 of the 36 CTI schools offer two-year degrees (which the FAA has found perfectly suitable for hiring) and many are community colleges. In other words, the diversity represented in CTI schools is greater than the Analysis indicates.


The new hiring scheme was presented without consulting stakeholders. In itself, this suggests underhanded dealings, as it is clear that had involved and invested parties been participants in the discussion, the myriad concerns and problems mentioned here would have been brought to light sooner.

It very much appears that some small contingency within the FAA, or some party presenting external pressure, has influenced the decision-making process in an irrational, irresponsible, and legally questionable manner. The new hiring scheme is clearly targeted at meeting racial quotas, which, otherwise known as racism, is patently immoral and quite likely impermissible.

[Scanned mail stamp: RECEIVED MAR 12 2014, OD Mail Operation]

From: Mr. Kyle Nagle


Submitted: 7/23/2014 9:59 AM EDT
Email:

Phone:
Address:
Subject: AGL_Casework_50_07232014_095712.pdf
Message:

DOT Control No. 510-140814-035



Correspondence Cover Sheet

510-140814-035



EXHIBIT 18



Attorneys at Law
814 W. Roosevelt
Phoenix, Arizona 85007
(602) 258-1000 Fax (602) 523-9000

Michael W. Pearson, AZ SBN 016281


[email protected]
[email protected]
Admitted Pro Hac Vice
Attorneys for Plaintiff

IN THE UNITED STATES DISTRICT COURT

FOR THE CENTRAL DISTRICT OF CALIFORNIA


WESTERN DIVISION


Jorge Alejandro Rojas,


Plaintiff
vs.

No. CV 15-5811 CBM (SSx)


JORGE ALEJANDRO ROJAS
AFFIDAVIT


Federal Aviation Administration;



Defendant

(Assigned to the Honorable Consuelo B. Marshall)


I, Jorge Alejandro Rojas, swear as follows:


1. I am the Plaintiff in this matter and have information concerning this case as a
result of my involvement with Defendants, Defense Counsel and my counsel.
2. If called to testify as a witness in this case, I will be able to testify as to the truth
of each and every statement enumerated herein.


3. This affidavit is written in support of Plaintiff's Response to Defendant's Motion for Summary Judgment, Plaintiff's Controverting Statement of Facts and Separate Statement of Facts in Support of Plaintiff's Response to Defendant's Motion for Summary Judgment.




4. The purpose of this civil action includes identifying the ability of the 2015
Biographical Assessment to identify characteristics needed for the Air Traffic Control
Specialist position.
5. Based on a review of the 2014 and 2015 Biographical Assessments, it is clear
that they are significantly different.
6. Based on the FAA's responses to FOIA requests, including requests 2015-008178 and 2016-000431, I calculated the pass and failure rates at the initial application stage of the 2014 hiring announcement.
7. Based on the FAA's responses to FOIA requests, including requests 2015-007021 and 2015-009349, I calculated the pass and failure rates at the initial application stage of the 2015 hiring announcement.
8. While proceeding pro se in this matter, I corresponded with Defendant's Counsel Alarice M. Medrano, who indicated that the Agency remanded the appeal of the subject FOIA request for action because the Agency had searched for 2014 records instead of 2015 records. Furthermore, it was made clear that the subject of the FOIA request was the study proving that the testing instrument was valid.
9. I compiled the Adverse Impact Ratios using the FAA's FOIA responses to requests for information concerning the demographics (race, ethnicity, and gender) for the 2014 application for Air Traffic Control Specialists. Based on guidance from the Equal Employment Opportunity Commission concerning the calculation of adverse impact ratios, I found that adverse impact existed. I utilized FOIA responses from the FAA, including 2015-008178 and 2016-000431, to compile the rates.
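For illustration only, a minimal sketch of an adverse impact ratio calculation in the style of the EEOC four-fifths rule referenced in paragraph 9; the group labels and counts below are hypothetical and are not the figures at issue in this case.

# Hypothetical illustration of an adverse impact ratio under the EEOC
# "four-fifths" rule of thumb: a group's selection rate divided by the
# highest group's selection rate; a ratio below 0.80 is commonly read
# as evidence of adverse impact. The counts below are invented.
applicants = {"group_a": 1000, "group_b": 400}
selected   = {"group_a": 300,  "group_b": 60}

rates = {g: selected[g] / applicants[g] for g in applicants}
highest = max(rates.values())

for group, rate in rates.items():
    ratio = rate / highest
    flag = "adverse impact indicated" if ratio < 0.80 else "ok"
    print(f"{group}: selection rate {rate:.2%}, impact ratio {ratio:.2f} ({flag})")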
10. Exhibit 1 of Plaintiff's Response to Defendant's Motion for Summary Judgment (MSJ) is a true and correct copy of a letter dated December 8, 2015 from FAA Administrator Michael Huerta to Kelly A. Ayotte, Chair of the Subcommittee on Aviation Operations, Safety, and Security. This letter is available online. The letter concerns the FAA's changes to the hiring process and the validation of the examination.



11. Exhibit 2 of the MSJ contains two documents, and is a true and correct copy of
the portion of the application process of the 1) 2014 and 2) 2015 application process
for Air Traffic Control Specialist applicants. The document was provided to me by an
individual(s) impacted by the FAA's changes to the hiring process. The answer choices
selected have been redacted. The 2014 examination is presented first and has the middle
pages removed. The 2015 examination has all pages except the first and last 2 removed
to protect the identity of those taking and/or providing me said documents.
12. Exhibit 3 of the MSJ is a true and correct copy of a transcript taken by a private
company concerning a Telephonic Conference concerning FAA Hiring Practices, held
on January 4, 2014. The document is available online.
13. Exhibit 4 of the MSJ is a true and correct copy of an email sent by Joseph
Teixeira, former Vice President for Safety & Technical Training for FAA. The email
concerns the revisions to the hiring process, including the use of the biographical
questionnaire. The email was sent to institutions that are a part of the AT-CTI program, and the
email is widely available online. The email was sent on or about December 30, 2013.
14. Exhibit 5 of the MSJ is a true and correct copy of a portion of a conversation
held between an individual impacted by the changes to the hiring process and Matthew
Borten, an FAA representative, concerning the use of the biographical questionnaire.
This conversation took place during an FAA sanctioned and sponsored Virtual Career
Fair concerning the new hiring process for the 2014 cycle. The segment of the
conversation is available online.
15. Exhibit 6 of the MSJ is a true and correct copy of a presentation given by the
Federal Aviation Administration (FAA) to stakeholders affected or briefed on the
changes to the Air Traffic Control Specialist hiring process. The document, dated on or
about January 2015, was provided to me by a member institution of the Association of
Collegiate Training Institutions.




16. Exhibit 7 of the MSJ is a true and correct copy of an e-mail sent by the FAA to
individuals impacted by the changes to the FAA hiring process. The document, dated
on or about January 27, 2014, was provided to me by a member of the Association of
Collegiate Training Institutions. Furthermore, the e-mail was widely distributed.
17. Exhibit 8 of the MSJ is a true and correct copy of the National Black Coalition
of Federal Aviation Employees (NBCFAE) google group, NBCFAEinfoWESTPAC,
and provides information concerning the hiring announcement. The post was written
from the account of James Swanson, an NBCFAE member. The Exhibit is widely
available online and was initially posted on or about January 24, 2014.
18. Exhibit 9 of the MSJ is a true and correct copy of the FAA's website prior to
issuing the off the street, open source vacancy announcement. The webpage
screenshot was widely distributed amongst the community of students impacted by the
changes.
19. Exhibit 10 of the MSJ is a true and correct copy of three documents available
from FAA's website concerning the validation study of the previous examination used
to test Air Traffic Control Specialist applicants. One document is titled,
Documentation of Validity for the AT-SAT Computerized Test Battery, Volume I,
and is dated March 2001. The second document is Volume II of the same study. The
third document is a March 2013 report available from the FAA website concerning
The Validity of the Air Traffic Selection and Training (AT-SAT) Test Battery in
Operational Use.
20. Exhibit 11 of the MSJ is a true and correct copy of an article published online by
Anna Burleson, dated March 5, 2014, concerning the changes to the ATCS hiring
program.
21. Exhibit 12 of the MSJ is a true and correct copy of the rejection notifications
received by applicants for the 2014 and 2015 vacancy announcements. The first page




of the exhibit was provided by an individual impacted by the changes, and the second
page is my own.
22. Exhibit 13 of the MSJ is a true and correct copy of the acknowledgement letter
for FOIA request 2015-006130. This letter was sent to me by the FAA.
23. Exhibit 14 of the MSJ is a true and correct copy of an email between Alarice M.
Medrano, Assistant U.S. Attorney with the Department of Justice, and myself. The
email concerns the Agency's revised search concerning the documents at issue.
24. Exhibit 15 of the MSJ is a true and correct copy of a memorandum issued by
FAA Chief Operating Officer Teri Bristol to FAA employees, on February 11, 2016,
concerning the validation of the AT-SAT examination. This memorandum is widely
available online.
25. Exhibit 16 of the MSJ is a true and correct copy of content available on the APT
Metrics website, including their main home page, a section titled "Litigation Support,"
and a copy of the presentation "Testing the Test" available on the APT Metrics website.
This information was retrieved on or about April 22, 2016.
26. Exhibit 17 of the MSJ is a true and correct copy of a letter sent by FAA
Administrator Huerta to Kyle Nagle, an individual interested in the changes to the
hiring process for Air Traffic Control Specialists. The letter dated October 8, 2014 is
available amongst the community of those impacted. This letter was in response to Mr.
Nagle's letter to Vice President of the United States Joe Biden.


I swear or affirm under penalty of perjury under United States laws that my answers
on this form are true and correct. 28 U.S.C. sec. 1746; 18 U.S.C. sec. 1621.


///


Executed this 25th day of April, 2016.

Jorge Alejandro Rojas


Subscribed, sworn to and acknowledged before me by Jorge Alejandro Rojas,


this 25th day of April, 2016.

Notary Public:

My Commission Expires:

