Does not systematically and consistently disadvantage one group of test takers over another group.

Which is TRUE about reliability in the psychometric sense?

A. reliability is an all-or-none measurement

B. a test may be reliable in one context and unreliable in another

C. a reliability coefficient may not be derived for personality tests

D. alternate forms reliability may not be derived for personality tests

B

In classical test theory, an observed score on an ability test is presumed to represent the testtaker's

A. true score.

B. true score less the variance.

C. true score combined with extraneous factors.

D. true score and error.

D

Which is TRUE of measurement error?

A. Like error in general, measurement error may be random or systematic.

B. Unlike error in general, measurement error may be random or systematic.

C. Measurement error is always random.

D. Measurement error is always systematic.

A

This variety of error has also been referred to as "noise." It is

A. systematic error.

B. random error.

C. measurement error.

D. background error.

B

The more homogeneous a test is, the

A. less inter-item consistency it can be expected to have.

B. more utility the test has for measuring multifaceted variables.

C. more inter-item consistency it can be expected to have.

D. None of these

C

A confidence interval is a range or band of test scores that

A. has proven test-retest reliability.

B. is calculated using the standard error of the difference.

C. is likely to contain the true score.

D. None of these

C

The standard error of measurement is

A. used to infer how far an observed score is from the true score.

B. also known as the standard error of a score.

C. used in the context of classical test theory.

D. All of these

D

Reliability, in a broad statistical sense, is synonymous with

A. consistently good.

B. consistently bad.

C. consistency.

D. validity.

C

A reliability coefficient is

A. an index.

B. a proportion of the total variance attributed to true variance.

C. unaffected by a systematic source of error.

D. All of these

D

Which of the following is true of systematic error?

A. It significantly lowers the reliability of a measure.

B. It insignificantly lowers the reliability of a measure.

C. It increases the reliability of a measure.

D. It has no effect on the reliability of a measure.

D

As the degree of reliability increases, the proportion of

A. total variance attributed to true variance decreases.

B. total variance attributed to true variance increases.

C. total variance attributed to error variance increases.

D. None of these

B
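
A minimal sketch of this relationship, assuming the classical test theory view that total variance is the sum of true variance and error variance. The function name and the variance values are hypothetical and chosen only for illustration.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Proportion of total variance attributable to true variance."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical values: same true variance, shrinking error variance.
less_reliable = reliability_coefficient(80.0, 20.0)
more_reliable = reliability_coefficient(80.0, 10.0)
print(less_reliable, more_reliable)
```

With true variance held constant, reducing error variance raises the proportion of true variance and therefore the coefficient.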

A source of error variance may take the form of

A. item sampling.

B. testtakers' reactions to environment-related variables such as room temperature and lighting.

C. testtaker variables such as amount of sleep the night before a test, amount of anxiety, or drug effects.

D. All of the above

D

Which type of reliability estimate is obtained by correlating pairs of scores from the same person (or people) on two different administrations of the same test?

A. a parallel-forms estimate

B. a split-half estimate

C. a test-retest estimate

D. an au-pair estimate

C
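
A test-retest estimate can be sketched as a Pearson correlation between scores from the two administrations. The scores below are made up, and the helper `pearson_r` is a hypothetical name written out by hand rather than taken from any particular library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 14, 16, 18]
second_administration = [11, 12, 15, 15, 19]
test_retest_estimate = pearson_r(first_administration, second_administration)
print(test_retest_estimate)
```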

Which type of reliability estimate would be appropriate only when evaluating the reliability of a test that measures a trait that is presumed to be relatively stable over time?

A. parallel-forms

B. alternate-forms

C. test-retest

D. split-half

C

An estimate of test-retest reliability is often referred to as a coefficient of stability when the time interval between the test and retest is more than

A. 30 days.

B. 60 days.

C. 3 months.

D. 6 months.

D

Which of the following might lead to a decrease in test-retest reliability?

A. the passage of time between the two administrations of the test.

B. coaching designed to increase test scores between the two administrations of the test.

C. practice with similar test materials between the two administrations of the test.

D. All of these

D

Which of the following is TRUE for parallel forms of a test?

A. The means of the observed scores are equal for the two forms.

B. The variances of the estimated scores are equal for the two forms.

C. The means and variances of the observed scores are equal for the two forms.

D. The means and variances of the estimated scores are equal for the two forms.

C

Which source of error variance affects parallel- or alternate-form reliability estimates but does not affect test-retest estimates?

A. fatigue

B. learning

C. practice

D. item sampling

D

Which of the following types of reliability estimates is the most expensive due to the costs involved in test development?

A. test-retest

B. parallel-form

C. internal-consistency

D. Spearman's rho

B

What term refers to the degree of correlation between all the items on a scale?

A. inter-item homogeneity

B. inter-item consistency

C. inter-item heterogeneity

D. parallel-form reliability

B

Test-retest estimates of reliability are referred to as measures of ________, and split-half reliability estimates are referred to as measures of ________.

A. true scores; error scores

B. internal consistency; stability

C. interscorer reliability; consistency

D. stability; internal consistency

D

The Spearman-Brown formula is used for:

A. correcting for one half of the test by estimating the reliability of the whole test.

B. determining how many additional items are needed to increase reliability up to a certain level.

C. determining how many items can be eliminated without reducing reliability below a predetermined level.

D. All of these

D
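
The three uses listed above all follow from the general Spearman-Brown formula, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which test length changes. The sketch below assumes that standard form; the function names and the example reliabilities are hypothetical.

```python
def spearman_brown(r: float, n: float) -> float:
    """Predicted reliability when test length is multiplied by n."""
    return (n * r) / (1 + (n - 1) * r)


def length_factor_needed(r: float, target: float) -> float:
    """Length multiplier needed to move reliability from r to target."""
    return (target * (1 - r)) / (r * (1 - target))


# Use 1: step a half-test correlation up to the whole test (n = 2).
whole_test = spearman_brown(0.6, 2)  # about 0.75

# Use 2: how much longer must a test with r = 0.6 be to reach 0.9?
lengthen_by = length_factor_needed(0.6, 0.9)  # about 6 times as long

# Use 3: how far can a test with r = 0.9 be shortened and stay at 0.8?
shorten_to = length_factor_needed(0.9, 0.8)  # a factor below 1

print(whole_test, lengthen_by, shorten_to)
```

A factor above 1 means items must be added; a factor below 1 means items can be removed.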

For a heterogeneous test, measures of internal-consistency reliability will tend to be ________ compared with other methods of estimating reliability.

A. higher

B. lower

C. very similar or higher

D. more robust

B

Typically, adding items to a test will have what effect on the test's reliability?

A. Reliability will decrease.

B. Reliability will increase.

C. Reliability will stay the same.

D. Reliability will first increase and then decrease.

B

Error variance for measures of inter-item consistency comes from

A. fatigue.

B. motivation.

C. a testtaker practice effect.

D. heterogeneity of the content.

D

Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?

A. Randomly assign items to each half of the test.

B. Assign odd-numbered items to one half and even-numbered items to the other half of the test.

C. Assign the first-half of the items to one half of the test and the second half of the items to the other half of the test.

D. Assign easy items to one half of the test and difficult items to the other half of the test.

D

Which of the following is generally the preferred statistic for obtaining a measure of internal-consistency reliability?

A. KR-20

B. KR-21

C. Kendall's Tau

D. coefficient alpha

D
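
Coefficient alpha can be sketched from its usual formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The response matrix below is invented for illustration, and `coefficient_alpha` is a hypothetical helper name.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """item_scores: one row per testtaker, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: five testtakers, three items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [4, 5, 4],
]
alpha = coefficient_alpha(responses)
print(alpha)
```

Because the value is computed from one sample's responses, it describes that set of scores rather than the test in the abstract.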

Which of the following is TRUE about coefficient alpha?

A. Kuder thought it to be the single best measure of reliability.

B. It was first conceived by Alfalfa Alpha.

C. It is a characteristic of a particular set of scores, not of the test itself.

D. None of these

C

Which type(s) of reliability estimates would be most appropriate for a measure of heart rate?

A. test-retest

B. alternate-form

C. parallel form

D. internist consistency

A

Item response theory (IRT) focuses on the

A. circumstances that inspired the development of the test.

B. test administration variables.

C. individual items of a test.

D. "how and why" of the Interborough Rapid Transit line

C

The standard deviation of a theoretically normal distribution of test scores obtained by one person on equivalent tests is

A. the standard error of the difference between means.

B. the standard error of measurement.

C. the standard deviation of the reliability coefficient.

D. the variance.

B

As the reliability of a test increases, the standard error of measurement

A. increases.

B. decreases.

C. remains the same.

D. alternately increases, then decreases.

B
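
The inverse relationship can be seen in the usual formula SEM = SD * sqrt(1 - r). The sketch below assumes that formula and a normal-theory band of plus or minus 1.96 SEM around an observed score; the standard deviation, reliabilities, and function names are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def confidence_interval(observed: float, sem: float, z: float = 1.96):
    """Band around an observed score that is likely to contain the true score."""
    return observed - z * sem, observed + z * sem


# Same hypothetical SD of 15, two different reliabilities.
sem_low_rel = standard_error_of_measurement(15.0, 0.84)
sem_high_rel = standard_error_of_measurement(15.0, 0.91)
print(sem_low_rel, sem_high_rel)
print(confidence_interval(100.0, sem_high_rel))
```

The more reliable test yields the smaller SEM and therefore the narrower band.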

Which statistic can help the test user determine how large a difference must exist for scores yielded from two different tests to be considered statistically different?

A. standard error of measurement between two scores

B. standard error of the difference between two scores

C. observed variance minus error variance

D. standard error of the difference between two means

B
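
A sketch of the standard error of the difference, assuming both tests report scores on the same scale (same standard deviation) so that SED = SD * sqrt(2 - r1 - r2). The reliabilities, scores, function names, and the 1.96 cutoff are illustrative assumptions.

```python
from math import sqrt


def standard_error_of_difference(sd: float, r1: float, r2: float) -> float:
    """SED for two scores on the same scale with reliabilities r1 and r2."""
    if not (0.0 <= r1 <= 1.0 and 0.0 <= r2 <= 1.0):
        raise ValueError("reliabilities must be between 0 and 1")
    return sd * sqrt(2.0 - r1 - r2)


def differs_significantly(score_a: float, score_b: float, sed: float,
                          z: float = 1.96) -> bool:
    """True when the gap between scores exceeds z standard errors."""
    return abs(score_a - score_b) > z * sed


sed = standard_error_of_difference(15.0, 0.91, 0.84)  # about 7.5
print(sed)
print(differs_significantly(110.0, 100.0, sed))  # gap of 10 points
print(differs_significantly(120.0, 100.0, sed))  # gap of 20 points
```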

What is the difference between alternate forms and parallel forms of a test?

A. Alternate forms do not necessarily yield test scores with equal means and variances.

B. Alternate forms are designed to be equivalent only with regard to level of difficulty.

C. Alternate forms are different only with respect to how they are administered.

D. There are no differences between alternate and parallel forms of a test.

A

The index that allows a test user to compare two people's scores on a specific test to determine if the true scores are likely to be different is

A. the standard error of the mean.

B. the standard error of the difference.

C. the standard deviation.

D. the correlation coefficient.

B

Which type of reliability is directly affected by the heterogeneity of a test?

A. test-retest

B. interrater

C. internal-consistency

D. alternate-forms or parallel-forms

C

Generalizability theory is most closely related to

A. developing norms.

B. item analysis.

C. test reliability.

D. the way things are "in general."

C

Which of the following is TRUE of the standard error of measurement?

A. The larger the standard error of measurement, the better.

B. The standard error of measurement is inversely related to the standard deviation (that is, when one goes up, the other goes down).

C. The standard error of measurement is inversely related to reliability (that is, when one goes up, the other goes down).

D. A low standard error of measurement is indicative of low validity.

C

In the formula X = T + E, T refers to

A. the true score.

B. the time factor.

C. the average test score.

D. test-retest reliability.

A

The greater the proportion of the total variance attributed to true variance, the more ____________ the test.

A. scientific

B. variable

C. reliable

D. expensive

C

IRT is a term used to refer to

A. a model that has many parameters.

B. a parameter that has many models.

C. a family of models for data analysis.

D. a dysfunctional family of models.

C

As the term is applied to a test, validity is a judgment or estimate of how well a test

A. measures what it purports to measure.

B. measures what it purports to measure in a particular context.

C. satisfies the deductions that could logically be made from inferences about it.

D. All of these

B

Predictive and concurrent validity can be subsumed under

A. content validity.

B. criterion-related validity.

C. face validity.

D. true score validity.

B

Relating scores obtained on a test to other test scores or data from other assessment procedures is typically done in an effort to establish the __________ validity of a test.

A. content-related

B. criterion-related

C. face

D. about-face

B

Face validity refers to

A. the most preferred method for determining validity.

B. another name for content validity.

C. the appearance of relevancy of the test items.

D. validity determined by means of face-to-face interviews.

C

A standard against which a test or test score is evaluated is known as

A. a facet.

B. a correlation coefficient.

C. a validity coefficient.

D. a criterion.

D

The form of criterion-related validity that reflects the degree to which a test score is correlated with a criterion measure obtained at the same time that the test score was obtained is known as:

A. predictive validity.

B. construct validity.

C. concurrent validity.

D. content validity.

C

The form of criterion-related validity that reflects the degree to which a test score correlates with a criterion measure that was obtained some time subsequent to the test score is known as:

A. predictive validity.

B. construct validity.

C. concurrent validity.

D. content validity.

A

What type of validity evidence BEST sheds light on how a shorter and less expensive test compares with a longer and more expensive one?

A. predictive criterion-related validity

B. concurrent criterion-related validity

C. content validity

D. construct validity

B

Blueprinting is best associated with

A. construct validity.

B. content validity.

C. criterion-related validity.

D. architectural validity.

B

Criterion-related validity can be evaluated through the use of

A. expectancy data.

B. reliability coefficients.

C. the Rulon formula.

D. None of these

A

All validity evidence can be interpreted as ________ validity.

A. content

B. criterion-related

C. predictive

D. construct

D

Which term is used to refer to the tendency of a rater to evaluate ratees higher than they objectively deserve because of the rater's inability to discriminate between aspects of the ratee's behavior?

A. halo effect

B. random error

C. generosity error

D. severity error

A

If new predictors explain something about a predicted score that was not already explained by existing predictors, the new predictor might be praised for its

A. test-retest reliability.

B. incremental validity.

C. construct validity.

D. face validity.

B

In psychological testing and assessment, bias refers to

A. random variation in test performance attributable to covert prejudice on the part of the test developer.

B. systematic variation in test performance that is unrelated to the construct that the test is intended to measure.

C. a test or testing practice that systematically favors the performance of one group of testtakers over another.

D. All of these

B

According to Piaget, one of the basic mental operations through which learning occurs is assimilation, defined as

A. organizing new information so that it fits with existing information.

B. changing existing information so that it fits with existing information.

C. changing new information so that it fits with existing information.

D. rejecting existing information that does not fit with new information.

A

One way of reducing the culture loading of a test is to

A. use oral instructions instead of printed instructions.

B. prohibit practice prior to a test.

C. have testtakers write answers on an answer sheet.

D. All of these

A

The Wechsler tests of intelligence

A. measure more than two factors.

B. measure only a Verbal and Performance factor.

C. are based on a factor-analytically derived, two-factor theory.

D. have been updated to reflect the VPR model.

A

The concepts of social intelligence, concrete intelligence, and abstract intelligence are collectively best associated with which theorist?

A. Thorndike

B. Carroll

C. Cattell

D. Horn

A

According to E. L. Thorndike, an individual's ability to learn is determined primarily by

A. consistency of the bonds that can be established.

B. speed of the bonds that can be established.

C. the number of bonds that can be established.

D. Both b and c

D

Galton's conception of intelligence focused on

A. sensory abilities.

B. environmental factors.

C. behavioral assets and deficits.

D. consistency of the bonds that can be established.

A

This was the test that launched the testing movement in the United States. It was, of course, the ______ test.

A. Wechsler-Bellevue

B. Stanford-Binet

C. Army-Alpha

D. Army-Navy

B

Binet believed that the primary purpose of an intelligence test was to assist the test user in the process of

A. classification.

B. measurement.

C. prediction.

D. data collection.

A

A major thread running through the theories of Binet, Wechsler, and Piaget is the concept of interactionism. In this context, interactionism refers to

A. interaction between mind and body.

B. members of different professions working together.

C. interaction between heredity and environment.

D. interaction between different psychological approaches to intelligence.

C

Who first hypothesized that the proportion of the variance that a number of tests have in common accounts for a general factor of intelligence?

A. Pearson

B. Piaget

C. Spearman

D. Galton

C

Crystallized intelligence includes

A. application of general knowledge.

B. nonverbal abilities.

C. sensory abilities.

D. All of these

A

In terms of their definitions of intelligence, the theories of Guilford, Thurstone, and Cattell, can all be grouped as

A. single-factor models.

B. information processing models.

C. factor analytic models.

D. runway models.

C

Spearman's g factor refers to

A. what different intelligence tests have in common.

B. the specific factors assessed by different intelligence tests.

C. the fact that Galton was Spearman's inspiration.

D. all that is genetic in the nature-nurture controversy.

A

In the Kaufman Assessment Battery for Children, subtests are organized by which of the following?

A. successive and simultaneous processing

B. general and specific abilities

C. reflective and impulsive processing

D. auditory and visual processing

A

The BEST measure of "intelligence" in infants could MOST probably be obtained by

A. reviewing hospital records.

B. structured interviews of the child.

C. assessment of sensorimotor skills.

D. behavioral observation of the child watching Sesame Street.

C

Children's intelligence is assessed primarily for

A. vocational placement and planning.

B. academic research and planning.

C. behavioral research and planning.

D. educational placement and planning.

D

Which test author developed a series of intelligence tests that has been characterized as spanning "cradle to grave"?

A. Thurstone

B. Wechsler

C. Sternberg

D. Piaget

B

Tests of intelligence are typically administered to older adults for all of the following purposes EXCEPT:

A. educational placement.

B. diagnosis of clinical disorders.

C. neuropsychological assessment.

D. vocational planning.

A

Preformationists would agree with which statement?

A. Intelligence can be attributed to socialization.

B. Intelligence can be increased by means of educational initiatives and early intervention programs.

C. Intelligence tends to be higher in Asians than in whites and blacks.

D. Intelligence is genetically encoded.

D

According to the interactionist viewpoint with regard to intelligence

A. intelligence is a latent, unidimensional construct.

B. most of the variance attributable to "intelligence" on intelligence tests has more to do with "social interaction."

C. people inherit a certain intellectual potential, and how much of that potential is realized is dependent in part on environmental influences.

D. heredity and environment interact with respect to a third variable that is presumed to be latent and unidimensional.

C

According to your text, intelligence as measured by intelligence tests is BEST explained with reference to

A. the nature side of the nature/nurture controversy.

B. the nurture side of the nature/nurture controversy.

C. the interactionist perspective.

D. the CHC model.

C

Which is a technique or method used to minimize cultural bias in tests?

A. minimized verbal instruction

B. use of teaching items

C. use of sample items

D. All of these

D

The Flynn effect is characterized by

A. an average rise in measured intelligence each year from the year a test was normed.

B. an average decline in measured intelligence each year from the year that a test was normed.

C. fluctuating scores in measured intelligence as a presumed function of different dominant sources of variance.

D. a radical change to a flamboyant and hedonistic lifestyle reminiscent of that of the 1940s movie star, Errol Flynn.

A

Spearman proposed a two-factor theory of intelligence consisting of which two factors?

A. general and specific intelligence

B. creative and fixed intelligence

C. long-term and short-term memory

D. learned and inherited intelligence

A

Based on longitudinal research with veterans of World War II, the most important determinant of cognitive ability as an older adult seems to be cognitive ability as

A. a preschooler.

B. an elementary-school-age child.

C. a middle-school-age child.

D. a young adult.

D

Which of the following is TRUE regarding the stability of intelligence?

A. Intelligence is generally stable from preschool through adulthood.

B. Intelligence is generally stable from childhood through adulthood.

C. Intelligence is generally stable through adulthood.

D. Intelligence is unstable from cradle to grave.

C

The concept of an alternate item was first introduced in

A. The Stanford-Binet Intelligence Scale.

B. The Wechsler-Bellevue Intelligence Scale.

C. The Wechsler Adult Intelligence Scale.

D. The Wechsler Intelligence Scale for Children.

A

A ratio IQ is calculated as follows:

A. mental age multiplied by chronological age, divided by 100.

B. chronological age divided by mental age, multiplied by 100.

C. mental age divided by chronological age, multiplied by 100.

D. deviation IQ divided by chronological age, multiplied by 100.

C
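
As a quick worked example of the ratio IQ formula, a minimal sketch with hypothetical ages (the function name is also hypothetical):

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ = (mental age / chronological age) * 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# A child with a mental age of 10 and a chronological age of 8.
print(ratio_iq(10, 8))  # 125.0
```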

A measured SB5 IQ in the range of 55 to 69 would place the testtaker in this category:

A. low average

B. borderline impaired or delayed

C. mildly impaired or delayed

D. moderately impaired or delayed

C

On the WAIS-IV, the best single overall index of intellectual ability is the

A. Full Scale IQ

B. GAI

C. CPI

D. FSI

B

On the WISC-IV, an index designed to help the test user understand the way that the testtaker processes various kinds of information is

A. a process score.

B. a Performance IQ.

C. freedom from distractibility

D. None of these

A

As cited in the text, the WRIT is

A. a formal order to engage in, or refrain from engaging in, some action.

B. a short form intelligence test.

C. an essay test of intelligence.

D. a reading test that yields an estimate of intelligence.

B

An assumption in factor analysis is that

A. correlation does not necessarily imply causation.

B. a single latent trait exists.

C. things that co-occur tend to have a common cause.

D. the residual covariance between two items with a common factor partialled out is 0.

C

The test that is the most widely used multiple aptitude test in the United States today is

A. the GATB.

B. the ASVAB.

C. the AGCT.

D. the DAS.

B

The Stanford Binet: Fifth Edition differed from previous editions of this test in that it

A. was a scale that Lewis Terman did not want to be associated with.

B. yielded a grade-correlated mental age.

C. was a point scale.

D. was not based on any particular theory of intelligence.

C

Starting with moderately difficult test items and then giving easier or harder items, depending on the test-taker's performance, is termed

A. personalized testing.

B. culture-fair testing.

C. biased testing.

D. adaptive testing.

D

Routing tests on the Stanford-Binet-5 are used to

A. determine the optimum level of item difficulty to start the other tests.

B. determine which course of study is best suited to the child's aptitude.

C. provide reliable information when making grade retention decisions.

D. teach children how to perform each type of test item.

A

The Stanford-Binet-5 is based on which theory?

A. Guilford's Structure of Intellect.

B. Cattell-Horn-Carroll theory of intellectual abilities.

C. Gardner's Multiple Intelligences.

D. Thurstone's Primary Abilities.

B

On the Wechsler tests of intelligence, the Full Scale IQ has a mean of ________________ and a standard deviation of _______________.

A. 50; 10

B. 100; 15

C. 10; 3

D. 115; 15

B

On the Wechsler tests of intelligence, the subtest scores have a mean of ____________ and a standard deviation of __________________.

A. 100; 15

B. 100; 16

C. 50; 10

D. 10; 3

D

On the Wechsler adult and children's intelligence tests, which subtest is most closely related to the total test score; that is, overall intelligence?

A. the Vocabulary subtest

B. the Arithmetic subtest

C. the Block Design subtest

D. None of these

A

The WISC-IV is appropriate for

A. children ages 6-16.

B. adults.

C. preschoolers ages 4-6.

D. infants under 6 months of age.

A

The WPPSI-III is used to measure the intelligence of children from ages ________ through ________.

A. 2.00; 8.75.

B. 2.25; 6.50.

C. 2.50; 7.25.

D. 2.75; 6.99.

C

Which is TRUE about short forms of the Stanford-Binet and Wechsler Scales?

A. They should never be used.

B. They are almost as accurate as the standard forms.

C. They are less reliable than the standard forms.

D. They cannot be used with deaf children.

C

A primary use for group intelligence tests is

A. job placement.

B. diagnosis.

C. screening.

D. Both a and b

C

Which is a DISADVANTAGE of group intelligence tests?

A. The lessened role of the examiner may mean that the examiner has less effect on the examinee's score.

B. The examiner has less opportunity to gather behavioral observations of examinees during testing.

C. Examinees may become bored or distracted as there tends to be less examiner interaction during test administration.

D. All other things equal, the reliability of group intelligence tests tends to be significantly less than the reliability of individually-administered tests.

B

Convergent thinking refers to

A. an inductive reasoning process.

B. a deductive reasoning process.

C. a vertical thinking process.

D. a lateral thinking process.

B

Compared with individually administered intelligence tests, group intelligence tests

A. are more psychometrically sound.

B. have a higher degree of predictive validity.

C. have the advantage in terms of cost efficiency.

D. All of these

C

Short forms of intelligence tests:

A. can be more reliable than the long form of the same test

B. are recommended for screening purposes only.

C. may be short on items, but they are all "long" on validity.

D. All of these

B

Which of the following statements is TRUE of the Wechsler tests of intelligence?

A. If an examiner has attained mastery in administering one, it will not be all that much of a stretch to learn how to administer another.

B. All Wechsler tests cover at least some portion of adult age ranges.

C. Administration errors never occur due to their specific rules for administration.

D. None of the Wechsler tests are as reliable as the SB5

A

Practice items on the WAIS-IV are also referred to as

A. behavioral rehearsal items.

B. performance items.

C. acquisition items.

D. teaching items

D

The deviation IQ reflects a comparison of the performance of the individual with the performance of others

A. in the entire standardization sample.

B. in the same grade in the standardization sample.

C. of the same age in the standardization sample.

D. in the same grade and of the same age in the standardization sample.

C
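
A deviation IQ can be sketched as a standard score with a mean of 100 and a standard deviation of 15, computed against same-age norms. The raw score and the age-group mean and standard deviation below are hypothetical, as is the function name.

```python
def deviation_iq(raw_score: float, age_group_mean: float,
                 age_group_sd: float) -> float:
    """Convert a raw score to a deviation IQ using same-age norms."""
    if age_group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (raw_score - age_group_mean) / age_group_sd
    return 100 + 15 * z


# Hypothetical norms for one age group: mean 50, SD 10.
print(deviation_iq(60, 50, 10))  # 115.0, one SD above same-age peers
```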

Through 1972, all published editions of the Stanford-Binet tended to be criticized for

A. the lack of acceptable test-retest reliability.

B. deficiencies with regard to the standardization sample.

C. oversights with respect to scoring guidelines.

D. "bugs" in the software used for test interpretation.

B

On a test of intelligence, teaching items are

A. designed to illustrate the task required.

B. designed to assure the examinee that the examiner knows what he or she is doing.

C. not formally scored.

D. Both a and c

D

Contributing greatly to the success of the WPPSI was the fact that this test

A. represented a downward extension of the WISC.

B. was the test of choice with gifted preschoolers.

C. adequately represented minorities in the standardization sample.

D. contained many items reflecting the CHC theory of intelligence.

C

Which of the following tests employed by the Army during World War I was MOST likely to be "culture-fair"?

A. the Army Alpha Test

B. the Army Beta

C. the AGCT

D. the ASVAB

B

As re-authorized in 2004, the Individuals with Disabilities Education Act (IDEA) changed the way that specific learning disabilities would be diagnosed. The legislation required states to allow the use of a process based on the child's response to scientific, research-based

A. treatment.

B. intervention.

C. therapy.

D. instruction.

B

A simplified statement of the response to intervention (RtI) model is as follows:

A. 1) intervene, 2) respond to intervention, 3) remediate or intervene as needed

B. 1) remediate, 2) evaluate, 3) instruct and remediate as needed

C. 1) evaluate, 2) instruct, 3) evaluate and instruct as needed

D. 1) instruct, 2) evaluate, 3) remediate or instruct as needed

D

RtI is said to be multilevel in nature because there are at least three levels of intervention. As noted in the text, these three levels feature instruction administered

A. 1) by parents, 2) by teachers, and 3) tutors providing individualized instruction.

B. 1) by school officials, 2) by special educators, and 3) by individual tutors.

C. 1) in the regular classroom, 2) in small groups, and 3) individualized instruction.

D. 1) in a special education class, 2) individualized, and 3) at-home by parents.

C

In implementing RtI, schools may use interventions tailored to students' individual needs that are selected by a multidisciplinary team of school professionals. According to the text, this approach to RtI is referred to as

A. the problem-solving model.

B. the general intervention model.

C. the hybrid model.

D. None of these

A

An approach to discovering a student's learning potential based on a model of testing-intervention-retesting is referred to as

A. integrative assessment

B. dynamic assessment

C. authentic assessment

D. biopsychosocial assessment

B

An informant marks test items indicating how often a particular behavior occurs. This informant is MOST likely completing a

A. rating scale.

B. structured interview.

C. projective inventory.

D. structured settlement.

A

An informant marks items describing the presence or absence of specified behaviors, thoughts, events or circumstances. The informant is MOST likely completing a

A. rating scale.

B. structured interview.

C. checklist.

D. structured projective inventory.

C

Checklists and rating scales are ideally used primarily for

A. making a diagnosis.

B. designing an intervention.

C. screening purposes.

D. revising other checklists and rating scales.

C

Psychoeducational test batteries are designed to measure

A. ability and achievement.

B. scholastic aptitude.

C. adjustment and personality.

D. academic motivation.

A

The K-ABC was designed to measure

A. achievement.

B. intelligence.

C. both achievement and intelligence.

D. general aptitude.

C

Which of the following is a psychoeducational test battery?

A. Stanford-Binet 5

B. Woodcock-Johnson III

C. Hilton PTB

D. K-ABC

B

Whether a test is referred to as an achievement test or an aptitude test may depend on

A. the face validity of the test for a specific purpose.

B. whether it can be shown that aptitude plays no factor in an achievement test score.

C. whether it can be shown that achievement plays no factor in an aptitude test score.

D. the use to which the test is put.

D

On a psychoeducational test battery, an intra-individual comparison may be made to evaluate a testtaker's:

A. relative strengths and weaknesses.

B. performance relative to other testtakers of the same age.

C. performance relative to other testtakers at the same grade level.

D. social skills relative to others of the same age.

A

Measures of interest, attitudes, and study habits are all BEST characterized as measures of

A. perceptual motor ability.

B. academic achievement.

C. motivation.

D. creativity.

C

Academicians may argue about what achievement tests actually measure, but it seems reasonable to conclude that most such tests measure

A. accomplishment.

B. intelligence.

C. ability.

D. potential.

A

Which academic subject area is most assessed at the elementary school level?

A. reading

B. science

C. social studies

D. arithmetic

A

Aptitude tests are typically designed to assess

A. learning that has occurred at school or by other formal means of instruction.

B. innate learning potential.

C. informal learning.

D. a combination of b and c

D

To determine if a child is ready for kindergarten, a school psychologist might administer

A. the Metropolitan Readiness Test.

B. the CLEP.

C. the Iowa Tests of Basic Skills.

D. the Mooney Problem Checklist.

A

The KeyMath3 is used with testtakers in the age range of 3-years-old to

A. 12-years-old

B. 18-years-old

C. 21-years-old

D. None of these

C

Achievement batteries tend to be __________________ instruments, whereas measures of achievement in specific subject areas tend to be _________________ instruments.

A. criterion-referenced; norm-referenced

B. teacher-made; standardized

C. group-administered; individually administered

D. standardized; teacher-made

D

Tests of achievement typically measure

A. innate potential.

B. informal learning based on life experiences.

C. learning that has occurred in formal and structured settings.

D. a combination of innate potential and learning that has occurred through everyday life experiences.

C

What is a difference between achievement and aptitude tests?

A. Aptitude tests are more limited in scope than achievement tests.

B. Aptitude tests draw on a broader fund of knowledge than achievement tests.

C. Aptitude tests are not used to make predictions about future performance, whereas achievement tests are used for this purpose.

D. Aptitude tests require skills that are formally taught in school, and achievement tests require skills that are learned informally.

B

Which of these is best characterized as a diagnostic test?

A. the ACT

B. the Woodcock-Johnson Psycho-Educational Test Battery

C. the Woodcock Reading Mastery Test—Revised

D. the Kaufman Assessment Battery for Children

C

As used in schools, diagnostic tests are designed to

A. diagnose learning disabilities.

B. pinpoint the biological or medical basis for a specific learning problem.

C. pinpoint where a student is having difficulty in a particular academic skill.

D. determine which academic area a child is best suited to major in.

C

Perhaps the greatest value of preschool tests lies in their use for identifying children who are

A. in a very low range of functioning and in need of intervention.

B. in a very high range of functioning and in need of intervention.

C. Both a and b

D. None of these

A

A performance test typically

A. contains items that demand the test-taker to perform.

B. taps certain knowledge or skills expected of the test-taker.

C. requires that correct responses be pantomimed.

D. None of these

B

Authentic assessment is also known as

A. true-to-life assessment

B. reality-based assessment

C. performance-based assessment

D. real-life assessment

C

As compared to more traditional assessment procedures, the assessor conducting a dynamic assessment typically

A. exhibits "more personality."

B. is not neutral.

C. is also an educator.

D. All of these

B

According to the text, a revision of the Individuals with Disabilities Education Act specifically states that

A. an aptitude test must be valid for the purpose for which it is used.

B. only a licensed psychologist may make a diagnosis of SLD.

C. no single measure may be used to diagnose SLDs.

D. no single measure may be used to diagnose STDs.

C

Psychologist John Holland was BEST known for

A. a research program that led to the development of the Big 5.

B. a controversy regarding the primacy of states versus traits.

C. a theory that occupational choice depends on personality.

D. a tunnel linking New York and New Jersey, named in his honor.

C

A key definitional difference between the terms personality trait and personality state has to do with

A. the extent to which personality is viewed as a physical attribute.

B. how stable over time the characteristic is exhibited.

C. whether or not the characteristic is subject to a halo effect.

D. the motivation of the person or persons doing the rating.

B

In the title of the test known as the MMPI-A, the "A" stands for

A. Adult

B. Adolescent

C. Form "A"

D. Alternate Form

B

An implication of terms such as self-concept differentiation is that

A. personality traits can be fragmented.

B. the more unified one's self-perception, the better.

C. traits are context dependent.

D. None of these

C

An advantage of a personality test that is atheoretical in nature is that

A. test users can interpret the test according to their own theoretical preferences.

B. the test will, in all likelihood, be more valid than a theory-based test.

C. the test is much more likely to be "culture-specific" as well.

D. the test is much more likely to be "culture-fair" as well.

A

Which BEST describes what is typically measured in personality assessment?

A. social and communication skills

B. traits and states

C. charisma

D. creativity and motivation

B

The Self-Directed Search Tests are based on six personality

A. moods.

B. states.

C. types.

D. dispositions.

C

Personality tests are used for

A. evaluating influences on health.

B. planning psychotherapeutic interventions.

C. evaluating influences on academic performance.

D. All of these

D

An example of a personality test that employed empirical criterion keying in its development is the

A. 16 PF.

B. NEO-PI-R.

C. Rorschach.

D. MMPI.

D

The MOST frequently used and MOST researched personality test is the

A. Mooney Problem Checklist.

B. Rorschach.

C. 16 PF.

D. MMPI-2.

D

Criticisms of the MMPI-2 include all of the following EXCEPT:

A. there is questionable comparability of the MMPI and MMPI-2 scores.

B. there was a lack of minority group members in the standardization sample.

C. there was a lack of validity of the content scales.

D. the utility of the instrument has been repeatedly questioned.

B

The MMPI-2 is appropriate for which age range?

A. 6 to 60

B. 14 and older

C. 18 and older

D. 21 to 60

C

Which BEST describes a personality type?

A. a distinguishable behavior of an individual

B. context-specific behavior

C. an identified pattern of traits and states

D. an inherited characteristic of a person

C

Validity scales assist the test user in making judgments related to whether or not a testtaker:

A. lied when responding to a particular set of items.

B. understood the items.

C. misunderstood the items.

D. All of these

D

Which of the following represents the BEST alternative to the use of validity scales?

A. test directions that stress the importance of being honest when responding

B. including duplicate items throughout the test

C. using multiple sources of information to draw conclusions including third-party information.

D. having testtakers review their responses and then sign a statement attesting to the truthfulness of the responses.

C

Factor analysis is a statistical procedure designed to identify

A. the minimum number of variables accounting for the intercorrelations between the phenomena.

B. the maximum number of variables accounting for the intercorrelations between the phenomena.

C. the median number of variables accounting for the intercorrelations between the phenomena.

D. all potential variables accounting for the intercorrelations between the phenomena.

A

The 16 PF is a personality measure that assesses 16 different personality

A. states.

B. traits.

C. types.

D. defects.

B

Self-report measures have been criticized because

A. respondents can present themselves in a way that meets their needs.

B. results are less reliable than other measures.

C. results are less valid than reports from others.

D. self-report measures usually require a great deal of explanation.

A

Which of the following BEST sums up the reason why the MMPI-2-RF was developed?

A. The MMPI-2 had too many overlapping items.

B. There were too many test items on the MMPI-2.

C. The MMPI-2 validity scales lacked validity.

D. All of these

A

"A successful life" is an example of

A. an instrumental value.

B. a terminal value.

C. a normative value.

D. vocational value.

B

A data reduction method used in personality test development is

A. regression analysis.

B. factor analysis.

C. final analysis.

D. shrink analysis.

B

The MMPI-A may be criticized for its lack of

A. minority group membership in the standardization sample.

B. a contemporary and representative standardization sample.

C. a contemporary clinical sample.

D. a geographically representative clinical sample.

D

The validity scales of the MMPI, MMPI-2, and MMPI-2-RF

A. have all been heavily criticized as self-serving since they presume to measure the test's own validity.

B. measure only the validity of the test but do not measure the test's reliability or utility.

C. provide a gauge of testtaker response sets and patterns.

D. All of these

C

Objective tests of personality have in common with many objective tests of ability and achievement the fact that test items

A. typically contain only one correct answer.

B. can be scored by machine, computer or online.

C. have a sound grounding in theory.

D. All of the above

B

As concluded in Chapter 13, the semantic distinction between objective and projective tests

A. is meaningful, especially as it relates to the content of the items.

B. should be abandoned because there is really no difference between them.

C. is only meaningful to the extent that each relies heavily on self-report.

D. has meaning only with respect to the mechanisms typically used for scoring.

D

Behavioral assessment has many advantages over other forms of assessment. Which is NOT one of those advantages?

A. Behavioral assessment can provide adequate explanations for apparently contradictory dynamics in motivation.

B. Behavioral assessment can provide behavioral baseline data.

C. Behavioral assessment can provide a record of the assessee's behavioral strengths and weaknesses across a variety of situations.

D. Behavioral assessment can be used to pinpoint environmental conditions that are acting to trigger, maintain, or extinguish certain behaviors.

A

Projective tests are BEST characterized as _____ methods of personality assessment.

A. indirect

B. futile

C. objective

D. behavioral

A

In recent years, projective tests have

A. increasingly been criticized for being culturally loaded.

B. increasingly become norm-referenced.

C. attempted to tap less and less unconscious content.

D. None of these

B

The assumption that individuals provide structure to unstructured stimuli in a manner consistent with their individual needs, conflicts, and impulses is known as

A. the psychoanalytic concept of repression.

B. the stimulus-response hypothesis.

C. the defense mechanism of denial.

D. the projective hypothesis.

D

The use of projective tests minimizes

A. an assessee's attempt to fake good or fake bad.

B. the necessity for examinees to have a working knowledge of English.

C. cross-cultural bias.

D. the need for payment by third parties such as insurers.

A

Of the following projective tests, which is the MOST widely used?

A. the Draw-A-Person Test

B. the Holtzman Inkblot Test

C. the Thematic Apperception Test

D. the Rorschach Inkblot Test

D

Of the following projective picture-story telling tests, which is the MOST widely used?

A. The Thematic Apperception Test.

B. The Make-a-Picture Story Method.

C. The Children's Apperception Test.

D. The Roberts Apperception Test

A

Word association tests such as that developed by Jung

A. are usually based on cognitive theories of personality.

B. are based on the premise that certain key words represent areas of conflict.

C. utilize only "traumatic" stimulus words, in an attempt to diagnose associative disturbances.

D. employ normative databases with samples matched to U.S. census data.

B

Sentence completion tests

A. are based predominantly on cognitive theories of personality.

B. usually have a high degree of face validity.

C. are not vulnerable to "faking."

D. All of these

B

The strongest psychometric aspect of the Rorschach is its

A. inter-rater reliability with respect to interpretations.

B. inter-rater reliability with respect to scoring categories.

C. test-retest reliability over a short period of time.

D. internal-consistency split-half reliability for odd and even items.

B

Objective methods of personality assessment may include which of the following item formats?

A. multiple-choice

B. true-false

C. matching

D. All of these

D

Which of the following is MOST appropriate for determining the psychometric soundness of behavioral assessment?

A. classical test theory

B. generalizability theory

C. empirical methods

D. the experimental analysis of behavior

B

The type of research that attempts to replicate a real-world problem in a research or clinical setting is called

A. unobtrusive research.

B. case history research.

C. sign research.

D. analogue research.

D

Situational performance measures make it difficult to

A. fake bad.

B. fake good.

C. fake good or bad.

D. get a job.

B

The authors of the DSM-V addressed the criticism that its predecessor, DSM-IV

A. was not sufficiently biopsychosocial in orientation.

B. had insufficient validity studies to support its use.

C. had failed to include hypochondriasis as a disorder.

D. All of these

A

An interview procedure similar to a hypnotic interview but without the hypnotic induction is

A. the mental status examination.

B. the stress interview.

C. the cognitive interview.

D. None of these

C

What is commonly referred to as profiling is referred to by the FBI as

A. criminal psychological analysis.

B. criminal psychoanalysis.

C. criminal investigative analysis.

D. investigative psychological analysis.

C

At the core of profiling is the assumption that perpetrators of serial crimes leave at their crime scenes

A. physical evidence.

B. psychological evidence.

C. philosophical evidence.

D. business cards.

B

The biopsychosocial approach to assessment could BEST be described as

A. multicultural.

B. multidisciplinary.

C. psychoanalytic.

D. All of these

B

Which measures are typically included as part of a standard battery?

A. measures of intelligence and personality

B. measures of intelligence, personality, and neurological intactness

C. measures of intelligence, personality, neurological intactness, and case history data

D. a measure of intelligence, a measure of neurological intactness, and the Mooney Problem Checklist

B

A problem with self-report measures of depression, such as the Beck Depression Inventory, is that

A. testtakers can fake depression or fake the absence of depression.

B. they are highly unreliable and based solely on what happened to the testtaker during the previous day or so.

C. they do not gauge the severity of depression, only its presence or absence.

D. the theory on which they are based has been shown to be severely flawed.

A

Forensic psychological assessment entails evaluation in matters involving

A. whether a parent should retain custody of a child.

B. whether an individual is competent to stand trial.

C. whether an individual is guilty by reason of insanity.

D. All of these

D

Which of the following tools of psychological assessment is MOST likely to be used in order to evaluate a patient's level of premorbid functioning?

A. role-play

B. behavioral observation

C. the case history

D. All of these

C

Which tools of psychological assessment are used to assess child abuse and neglect?

A. psychological tests

B. interviews

C. anatomically correct dolls

D. All of these

D

The Barnum effect in psychological report writing refers to

A. very technical jargon that is too difficult for lay readers to understand.

B. conflicting statements about the assessee within the same report.

C. statements that by their very nature tend to be prejudicial.

D. vague, general statements that could be applied to most people.

D

Which of the following tools of assessment is MOST frequently used by clinicians in their daily work?

A. an interview

B. a rating scale

C. a picture-story projective test

D. reports from others

A

A psychological report is BEST written with

A. the diagnosis in mind.

B. the goal of the assessment in mind.

C. the tools of assessment that were employed at hand.

D. the best interest of the referral source in mind.

B

Of the following tests, which one is the MOST commonly used in evaluating substance abuse?

A. MacAndrew Alcohol Scale-Revised

B. The Addiction Potential Scale

C. Addiction Severity Index

D. Addiction Acknowledgment Scale

C

Which is NOT a subspecialty of neuropsychology?

A. forensic neuropsychology.

B. school neuropsychology.

C. vocational neuropsychology.

D. geriatric neuropsychology.

C

Symptoms of a neurological deficit may become evident

A. during an intake interview.

B. during the administration of psychological tests.

C. as the result of the administration of a psychological test.

D. All of these

D

A neuropsychological examination by an experienced neuropsychologist

A. varies widely as a function of the referral question.

B. consists of a standard battery of tests.

C. is usually initiated only if "hard signs" are found in screening.

D. is usually initiated only if "hard signs" are suspected.

A

A typical neuropsychological evaluation includes

A. a medical history of the patient.

B. a psychosocial history of the patient.

C. a history regarding developmental milestones.

D. All of these

D

The Tower of Hanoi is used

A. as a measure of various aspects of executive functioning.

B. as a measure of planning ability.

C. as a measure of cognitive flexibility.

D. All of these

D

Patients suffering from aphasia exhibit a loss of the ability to

A. perceive smells.

B. perceive sounds lower in volume than a "dollar watch."

C. express themselves orally or in writing.

D. hold their hands steady.

C

The Bender-Visual Motor Gestalt Test

A. requires a considerable amount of time and special equipment to administer.

B. relies solely on qualitative clinical judgment because no normative data are available.

C. was re-standardized using year 2000 census data.

D. was designed to assess perceptual maturation and neurological impairment.

D

Which is TRUE of a flexible battery in neuropsychological assessment?

A. A flexible battery of tests is typically preferable to a standard or prepackaged battery because it is more tailored to the assessee's presenting problem.

B. Tests administered in a flexible battery frequently overlap with some of the functions typically assessed in a standard battery, although both should always be administered.

C. All tests in a flexible battery have usually been normed on similar populations.

D. Tests administered in a flexible battery are less time-consuming than those in a standard battery.

A

Most neuropsychologists prefer which of the following when conducting a neuropsychological evaluation?

A. a prepackaged neuropsychological battery

B. a flexible neuropsychological battery

C. the Halstead-Reitan

D. the Luria-Nebraska

B

Which of the following is TRUE of a fixed, prepackaged neuropsychological test battery?

A. All of the tests in the battery are standardized.

B. All of the tests were developed based on one particular battery.

C. The tests can be administered in a flexible fashion.

D. Each test in the battery is designed to measure only one neuropsychological variable.

A

The Luria-Nebraska is

A. a personality assessment battery.

B. a flexible neuropsychological test battery.

C. a fixed neuropsychological test battery.

D. a patient expense for testing reimbursable from Mutual of Omaha.

C

The Halstead-Reitan Neuropsychological Test Battery

A. is relatively easy to administer and can be administered by paraprofessionals.

B. is useful in assessing perceptual, cognitive, and motor deficits.

C. can be administered in under two hours by a well-trained and experienced examiner.

D. Both b and c

B

The study of neuropsychology is BEST characterized as the study of

A. nervous system/cognition relationships.

B. brain/behavior relationships.

C. stimulus/response relationships.

D. physical/platonic relationships.

B

A self-report rating scale of neurological impairment is

A. the Patient's Assessment of Own Functioning Scale.

B. the Short Portable Mental Status Questionnaire.

C. the Neuropsychological Impairment Scale.

D. the Seashore Rating Scale.

C

The Controlled Word Association Test is used primarily to

A. evaluate pre-senile dementia.

B. measure cognitive functioning.

C. assess verbal fluency.

D. diagnose Alzheimer's disease.

C

The Symbol Search subtest of the WISC-III is an example of a

A. verbal memory test.

B. perceptual-motor test.

C. test of abstract reasoning.

D. test of fine motor skills.

B

The Halstead-Reitan Neuropsychological Battery is an example of

A. a flexible battery.

B. a routing battery.

C. an atheoretical battery.

D. a lithium battery.

C

The Wechsler Memory Scale-III taps primarily which type of memory?

A. procedural

B. semantic

C. declarative episodic

D. non-declarative procedural

C

Neuropsychological deficits in organizing and planning can BEST be assessed by which of the following?

A. the Wechsler Similarities Test

B. a proverb interpretation test

C. the Porteus Maze Test

D. an object sorting test

C

Why are Wechsler tests so popular with neuropsychologists for use in neuropsychological evaluations?

A. the varied nature of the tasks on these tests makes them appealing.

B. the culture-fairness of the materials makes them appropriate for administration to a wide variety of people.

C. most of these tests feature special norms for neurologically impaired populations.

D. All of these

A

Neurotology is

A. a branch of medicine that focuses on hearing and balance issues and problems.

B. a branch of medicine that focuses on toxins that can affect neurological functioning.

C. a branch of neuroscience that involves the study of law and ethics as it relates to neurology and neuropsychology.

D. a misspelling of neurology

A

"RIASEC" is an abbreviation for

A. the Big 6.

B. a popular integrity measure.

C. Dr. Chris Gee's consulting firm.

D. a type of corporate culture.

A

A difference between selection and classification is that in classification, personnel are

A. accepted or rejected.

B. pigeon-holed.

C. subjected to screening.

D. stereotyped.

B

Burnout has been shown to be

A. predictive of the frequency and duration of sick leave.

B. significantly evident in postal workers on the job for ten years or more.

C. negatively correlated with insomnia.

D. All of the above

A

Realistic, Investigative, Artistic, Social, Enterprising, and Conventional are categories of personality types that are measured by the

A. NEO-PI-R.

B. Myers-Briggs Type Indicator.

C. Maslow Hierarchy of Needs Inventory.

D. Self-Directed Search.

D

From the perspective of an employer, psychological assessment tools can provide assistance in

A. personnel selection.

B. placement of employees in job assignments.

C. screening of new applicants for positions.

D. All of these

D

Scores on the Strong Interest Inventory provide information about a testtaker's interests relative to

A. the skills required for a particular job.

B. the educational level required for a particular job.

C. the probability of success at a particular job based on academic achievement.

D. the interests of people actually employed in various occupations.

D

The General Aptitude Test Battery (GATB)

A. is administered only by licensed counseling psychologists.

B. was developed by the military to assist veterans in their job search.

C. taps cognitive, perceptual, and psychomotor skills.

D. Both a and b

C

The process of selection based on an evaluation of certain minimal criteria is referred to as

A. classification.

B. directed search.

C. screening.

D. interviewing.

C

The term that BEST applies to accept or reject with regard to the hiring of new personnel is

A. selection.

B. placement.

C. classification.

D. directed search.

A

A "real time, live action" approach to assessment that requires assessees to demonstrate abilities that typically are characteristic of those they might encounter on-the-job is referred to as:

A. portfolio assessment.

B. performance assessment.

C. curriculum-based assessment.

D. authentic assessment.

B

Performance assessments

A. provide an opportunity for potential employees to candidly discuss their strengths and weaknesses.

B. involve a thorough background check with previous employers.

C. provide a job-related sample of behavior.

D. are usually not standardized.

C

The Strong Interest Inventory measures

A. vocational aptitudes.

B. vocational interests.

C. vocational interests and aptitudes.

D. vocational interests and general mental ability.

B

The Myers-Briggs Type Indicator (MBTI) was based on the theoretical writings of

A. Sigmund Freud.

B. B. F. Skinner.

C. Carl Jung.

D. Holland Opus

C

The Work Preference Inventory (WPI) is designed to assess

A. intrinsic and extrinsic motivation.

B. ability to work with people and ability to work alone.

C. interest in intellectually demanding work as opposed to routine, non-demanding work.

D. active managerial style as distinct from passive managerial style.

A

Résumés and application forms are BEST classified as ________ instruments.

A. placement

B. screening

C. hiring

D. classification

B

According to your textbook, the typical objective of a performance test is to

A. compare an employee's performance with that of other employees.

B. assess the speed of an employee in completing a job-related task.

C. determine the accuracy of an employee's performance under stressful conditions.

D. obtain a job-related performance sample.

D

An assessment center is

A. a specific place in an organization staffed by the industrial/organizational psychologist.

B. a standardized procedure involving multiple assessment techniques.

C. a method of assessment used for selection, classification, and promotion.

D. Both b and c

D

According to your text, the Strong Interest Inventory was developed primarily on the assumption that

A. if an assessee has many outside interests, he or she will experience less on-the-job stress.

B. if an assessee's interests match those of people already in the field, the assessee may be well suited for the job.

C. all people in a particular field will work well together if they share similar interests.

D. the more interests an assessee has, the more adaptable he or she will be to various job-related situations.

B

Rating on the basis of one, sole criterion is referred to as

A. classification.

B. placement.

C. screening.

D. selection.

C

An aptitude test that includes both psychomotor and paper-and-pencil tasks is the

A. Bennett Mechanical Comprehension Test.

B. General Aptitude Test Battery.

C. Minnesota Clerical Test.

D. O'Connor Tweezer Dexterity Test.

B

At what age are students' interests thought to be sufficiently formed to be useful in course and career planning?

A. 10 years old

B. 15 years old

C. 7 years old

D. 13 years old

B

A major advantage of individual interviews with consumers is that they

A. have potential for verbal as well as nonverbal information.

B. are inexpensive.

C. can be conducted quickly.

D. are usually completed within five minutes or less.

A

The test we now refer to as the "Strong Interest Inventory" was, in its original edition, known as the

A. Strong Volitional Interest Bank.

B. Strong Vocational Interest Blank.

C. Strong Vocational Interest Inventory.

D. Bowflex Interest Inventory.

B

What does bias of a test mean?

A biased test is generally understood to be a test that produces results that are systematically unfair to some group. For this to happen, the test must ordinarily measure variables for that group that are at least partly distinct from those it measures for other people in the population.

What makes a test biased?

A test is considered biased when the scores of one group differ significantly from those of another group and the test has higher predictive validity for one group than for the other. Predictive validity is the extent to which a score on an assessment predicts future performance. Most test bias is considered cultural bias.
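One way to make the predictive-validity comparison concrete is to correlate test scores with a later outcome separately for each group. The sketch below is only an illustration under stated assumptions: the two groups, their scores, and the outcome values are invented, and a plain Pearson correlation stands in for predictive validity. It is not a formal procedure for establishing test bias.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: test scores and a later performance measure (e.g., GPA).
group_a_scores = [50, 60, 70, 80, 90]
group_a_outcome = [2.0, 2.5, 3.0, 3.5, 4.0]  # tracks the test score closely

group_b_scores = [50, 60, 70, 80, 90]
group_b_outcome = [3.0, 2.0, 3.5, 2.5, 3.0]  # barely related to the test score

# Predictive validity estimated separately for each group.
validity_a = pearson_r(group_a_scores, group_a_outcome)
validity_b = pearson_r(group_b_scores, group_b_outcome)

print(f"Group A predictive validity: {validity_a:.2f}")
print(f"Group B predictive validity: {validity_b:.2f}")
```

In this made-up data the test predicts the outcome far better for group A than for group B, which is the kind of gap the definition above describes. Real analyses would use much larger samples and would also compare regression slopes and intercepts across groups.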

What is an example of cultural bias in testing?

A real-life example of cultural bias in testing is an analogy question that appeared on the SAT in the 1980s, asking students to find the analogy comparable to "Runner: Marathon." The answer was "Oarsman: Regatta," which would be more familiar to students from white, affluent families.

What is test bias and test fairness?

Bias is a component of fairness—if a test is statistically biased, it is not possible for the testing process to be fair. However, a testing process can still be unfair even if there is no statistical bias present.