What is Applicant Faking Behavior? A Review on the Current State of Theory and Modeling Techniques

Socially desirable responding has become a central concept in research regarding applicants’ deliberate distortions in answering personality tests. Key findings based on this concept suggest that such distortions do not impair the utility of personality measures for personnel selection. However, in recent years some scholars have questioned the utility of social desirability itself as a concept and introduced another construct: applicant faking behavior. This narrative review provides an overview and integration of this construct by proposing a definition of applicant faking behavior and by summarizing the current state of faking theory and the latest techniques to model and detect applicant faking. Implications of these findings for research and personnel selection are discussed.

Findings from research on socially desirable responding (SDR) suggest that faking has little to no effect on the utility of personality tests in personnel selection. This is mainly because meta-analyses have shown that criterion validity (e.g., for job performance) is not affected by SDR and that correcting for SDR does not improve criterion validity (Barrick & Mount, 1996; Ones, Viswesvaran, & Reiss, 1996). Thus, reviews on the utility and usage of personality measures tend to conclude that faking (conceptualized as SDR) is not a problem at all (e.g., Hülsheger & Maier, 2008).
However, there is disagreement about what faking is. The conceptual nature of applicant faking has been debated, and some researchers have attempted to establish applicant faking behavior as a construct of its own (Kuncel & Bornemann, 2007), distinct from socially desirable responding. Nonetheless, in recent research the terms socially desirable responding (e.g., Ziegler & Buehner, 2009), impression management (e.g., Fan et al., 2012), or self-presentation (e.g., Marcus, 2009) are still used synonymously with applicant faking, implying that all of these account for the same set of behaviors.
In the debate over the nature of faking, the question of how to measure it must also be taken into account. In the early 1990s, faking research was based on faking-good studies in between-subject designs: an experimental group was instructed to fake as much as possible, and its mean personality scale scores were then compared with those of an honest control group (e.g., Cowles, Darling, & Skanes, 1992). Since then, studies with real applicants in within-subject designs have increased; that is, applicants have been tested during personnel selection processes and then again under normal circumstances (e.g., Griffith et al., 2007). However, these methods have been criticized because (a) measurement error and measurement nonequivalence also contribute to test score deviations (Stark, Chernyshenko, Chan, Lee, & Drasgow, 2001), and (b) they rely on the assumption that faking leads to maximum test scores, which has been challenged (e.g., Ziegler, 2011). Socially desirable responding scales have also been criticized for accusing honest respondents of distorting their responses (Tett & Christiansen, 2007).
There are different concepts of applicant faking behavior (AFB), which have not yet been condensed into an integrated approach. Thus, our goal was to gain insight into the current state of theories and models of applicant faking and how these can be integrated into a construct of AFB that is conceptually distinct from socially desirable responding. We also addressed how AFB could be measured adequately, which methodological and statistical procedures are state-of-the-art and in line with theoretical considerations of faking, and which of them reduce the risk of classifying honest respondents as having faked.

Literature Search
A three-step literature search was conducted to identify empirical studies and theoretical work on applicant faking behavior published between 1990 and 2015. First, a computer-based literature search was carried out in the PsycINFO and PsycARTICLES databases. In order to identify theoretical contributions, we used the following keywords: faking, applicant faking, socially desirable responding, personality measures, personnel selection, theory, and model. For contributions on modeling techniques, we searched for faking, applicant faking, socially desirable responding, personality measures, personnel selection, modeling, and method. Second, a manual search was conducted that consisted of checking the sources cited in the reference sections of previously gathered articles. Third, the following journals were examined: International Journal of Selection and Assessment, Journal of Personnel Psychology, Human Performance, Personnel Psychology, and Journal of Applied Psychology. If identified contributions had not (yet) been published, the authors were contacted by e-mail and asked to send their papers or PhD theses.

Criteria for Inclusion
Whether a contribution was included in the study depended on two main criteria. First, the contribution had to refer to faking in an applicant setting or at least be applicable to such settings (i.e., methodological considerations). Therefore, research in other fields (e.g., clinical research) was not included in this review. Second, the contribution had to deal with faking on general personality tests, such as Big Five or related self-report measures. Thus, studies examining faking on integrity or honesty tests were excluded, as was research on faking in interviews. In this way, 40 empirical studies and theoretical contributions were included and analyzed for this review.

Findings

What is Faking? The Gap Between Theory and Practice
First of all, we propose a definition of AFB that derives from our findings on the current state of faking theories and research: Applicant faking behavior is a collective term for all behaviors during selection procedures that contain intentional responses to a self-reported personality measure which do not correspond to the true self-image. This definition of AFB accounts for several important findings. First, faking is described as a behavior and not as a trait according to previous definitions (e.g., MacCann, . Second, the behavior is described as being intentional, which we do not equate with conscious or rational (following Griffith, Lee, Peterson, & Zickar, 2011), but which distinguishes it from random errors and other unsystematic deviations. Third, regarding correspondence to the true self-image, the definition makes no assumption about the direction of the deviation. Usually, faking applicants try to give a more favorable impression, but some contributions suggest that faking behavior can also result in lower scores than the true self-image would imply (Griffith et al., 2007;. Finally, the scope of AFB is limited to selection procedures and self-reported personality tests. This separates it from faking concepts in interviews or in clinical settings.

How is faking different from social desirability?
The distinction between faking and socially desirable responding arises from the traditional construct and measurement of SDR. Social desirability is closely tied to SDR scales (Tett & Christiansen, 2007), and these scales (the first one published by Crowne and Marlowe in 1960) usually include items an honest person could not completely agree with (for example, "I have never lied before."). Social desirability has been rejected as a conceptualization of faking for two main reasons. First, SDR cannot be measured reliably because SDR scales themselves are susceptible to faking (Viswesvaran & Ones, 1999). Second, measures of faking (e.g., between-subject mean differences) correlate poorly with SDR. For example, Tett and Christiansen (2007) stated that only about 10% of desirability variance can be attributed to faking. In summary, AFB might in part represent socially desirable behaviors (in a semantic sense), but it departs from traditional concepts of SDR and can therefore be considered a distinct construct.

The current state of faking theory
Motivation and ability to fake are the central influences on actual faking behavior (Goffin & Boyd, 2009; McFarland & Ryan, 2006; Snell, Sydell, & Lueke, 1999; Tett & Simonet, 2011). Simply put, to fake, one must be both willing and able to do so. Theoretical frameworks suggest that an applicant's need for a new job (Goffin & Boyd, 2009; Marcus, 2009) as well as their belief that faking is necessary to get the job (Ellingson & McFarland, 2011) are strong motivators to fake. Motivation can also be derived from self-monitoring and self-efficacy (Ellingson & McFarland, 2011; Goffin & Boyd, 2009; Marcus, 2009), inasmuch as applicants are aware of how they present themselves and how confident they feel in controlling this presentation. General mental ability (Ellingson & McFarland, 2011; Goffin & Boyd, 2009; Marcus, 2009) as well as emotional intelligence or empathy (Ellingson & McFarland, 2011; Marcus, 2009) have been shown to play a role in the ability to fake. In addition, the ability to identify the criteria a test assesses (König, Melchers, Richter, & Klehe, 2006; Marcus, 2009) requires knowledge of how personality tests are constructed as well as the skill to identify traits that are important for the job. However, high (cognitive) ability on its own will not make someone fake unless they are motivated to fake. Conversely, being motivated to fake only leads to faking if an applicant knows how to fake. Therefore, it is important to note that neither motivation to fake nor ability to fake will result in faking behavior by itself, but only if both are combined.
Two subtypes of applicant faking behavior emerge from empirical and theoretical work (Ziegler, Maaß, Griffith, & Gammon, 2015; Zickar, Gibby, & Robie, 2004), usually labeled slight faking and extreme faking behavior (Robie, Brown, & Beaty, 2007; Zickar & Robie, 1999). One main difference between slight and extreme faking is how applicants position themselves between their true self-image and the perceived ideal applicant profile. Quantitative (Zickar et al., 2004) and qualitative research (König, Merz, & Trauffer, 2012; Robie et al., 2007) has identified different response patterns and faking strategies. While applicants who fake slightly tend to make a compromise that still reflects their true self, applicants who fake extremely try to meet the ideal profile (Robie et al., 2007). Griffith and colleagues (2011) proposed four distinct types of faking, adding applicants who exaggerate their self-image without orienting themselves toward an ideal schema and impulsive applicants who orient themselves toward neither their self-image nor an ideal profile, although the latter two types of faking have not yet been detected empirically.
Third, as stated by Kuncel, Goldberg, and Kiger (2011), "Most test takers don't think like psychometricians" (p. 374), which is why even extreme faking does not necessarily lead to maximum test scores. The process of responding to a single item was investigated using a cognitive process model (Ziegler, 2011). This approach suggests that choosing neutral responses (i.e., the middle category) is central to faking strategies. This occurs when applicants judge an item to be unimportant for the job and try to keep information private or to avoid a wrong answer (Ziegler, 2011). Others think that extreme responses on certain conscientiousness items (e.g., "I stick to my chosen path.") could be interpreted as obstinacy. Thus, applicants who fake consider an optimal test result to be the ideal applicant's profile, rather than the maximum score. Figure 1 gives an overview of the current state of applicant faking theories, integrating Ziegler's (2011) cognitive process model and the model of ability and motivation as described above (Goffin & Boyd, 2009).

How Can Faking Be Detected? In Quest of a Faking Fingerprint
Identifying and modeling faking on a personality questionnaire has been a challenging issue for researchers. It is like a police inspector arriving at a crime scene and looking for clues to establish what happened. Before accusing someone of the crime, strong evidence is needed, because circumstantial evidence carries the risk of convicting the innocent. However, in faking research, hardly any method of identifying those who have been deceptive would serve as strong evidence. At the same time, wrong accusations could have serious consequences for honest respondents in terms of getting a job offer (Christiansen et al., 2010).

Insights into response process in faking
One approach to modeling faking behavior has focused on differential item functioning (DIF) as a consequence of faking. This research yields evidence that people who fake respond to some items as honest participants do but use different response processes for other items. Studies of these differences have utilized unfolding item response theory (IRT) models, which allow checking for (non-)monotonicity in item response functions (for more information, see Roberts, Donoghue, & Laughlin, 2000). O'Brien and LaHuis (2011) found four different types of DIF in 57% of items. The different DIF types are based on different response models (e.g., monotonic response functions for applicants but not incumbents). Other research found DIF even when only one response model was assumed (Robie et al., 2001; Vecchione et al., 2012). Thus, faking behavior and response strategy depend on the content of an item and therefore vary between items.
Other approaches have examined how faking differs between individuals. Recent findings suggest that faking behavior varies both quantitatively and qualitatively among respondents (Ziegler et al., 2015). A qualitative classification concerns distinct response patterns that can be identified using mixed-model IRT analysis (Zickar et al., 2004), while quantitative methods measure (latent) score differences between honest and faking conditions (Griffith et al., 2007; Ziegler & Buehner, 2012). However, Ziegler et al. (2015) compared classifications of both types and found no significant correlation. Thus, we propose that both types account for different possibilities and styles of faking. For example, to be classified as having faked extremely in qualitative analysis, a respondent has to undermine the test's measurement properties so much that θ (the person parameter estimate) no longer relates to option thresholds (Zickar et al., 2004). In quantitative analysis, only large deviations from honest scores are classified as extreme faking (Donovan, Dwight, & Schneider, 2014; Griffith et al., 2007). This produces ceiling effects, because respondents scoring high on a trait cannot deviate as much as respondents with low honest scores. Qualitative classification is independent of true trait scores and might be more helpful here. Then again, respondents can maximize their score without undermining the measurement properties (Scherbaum et al., 2013) and can then only be identified using quantitative classification. Hence, future research might benefit from combining both classification types and considering individual cases of faking.
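The ceiling effect in difference-score classification can be made concrete with a small sketch. Everything in it is a hypothetical assumption chosen for illustration and is not taken from the reviewed studies: a scale of 10 items scored 1 to 5, a cutoff of 15 raw points for "extreme" faking, and the two honest scores.

```python
# Hypothetical scale: 10 items, each scored 1-5 (maximum total score of 50).
N_ITEMS = 10
SCALE_MAX = 5
# Hypothetical cutoff: a gain of at least 15 points counts as "extreme" faking.
EXTREME_CUTOFF = 15


def max_possible_gain(honest_score: int) -> int:
    """Largest score increase still available above the honest score."""
    return N_ITEMS * SCALE_MAX - honest_score


def can_reach_cutoff(honest_score: int) -> bool:
    """True if the respondent could be classified as an extreme faker at all."""
    return max_possible_gain(honest_score) >= EXTREME_CUTOFF


# Two hypothetical respondents with low and high honest trait scores.
for honest in (25, 42):
    print(honest, max_possible_gain(honest), can_reach_cutoff(honest))
```

Under these assumptions, the respondent with the high honest score cannot reach the cutoff even by choosing the maximum on every item, so a difference-score rule would never label that person an extreme faker.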
The difficulty of detecting faking on an individual level can be illustrated with the response latencies approach (Holden & Lambert, 2015; Komar et al., 2010). Holden and colleagues (1992) proposed a congruence model of faking, according to which faking toward a desired direction takes less time than honest responding, whereas faking against that direction takes more time. Figure 2 presents key findings of Holden and Lambert's (2015) examination of response latencies on a group level. Providing evidence for the congruence model, the latency differences were significant across honest, faking-good, and faking-bad conditions. However, Holden and Lambert (2015) also examined how well these latencies can tell honest respondents from those who faked, finding that 37-44% of honest respondents were classified as having faked on the basis of response latencies. Furthermore, Komar et al. (2010) tried to reduce faking by speeding up test assessment and thereby leveling response latencies. However, speeding up the assessment had no effect on faking behavior. Thus, such methods can be considered part of a mosaic that reveals insights into response processes, but they are not yet precise enough on an individual level.

Discussion
Applicant faking behavior is a collective term for all behaviors during selection procedures that contain intentional responses to a self-reported personality measure which do not correspond to the true self-image. Our findings suggest that it is a construct (distinct from socially desirable responding) to describe and predict how applicants react to personality tests. In recent years, theoretical conceptualizations of faking behavior have been developed. Most theories emphasize that applicants must be both motivated and able to fake (Goffin & Boyd, 2009; Kuncel et al., 2011; Marcus, 2009) and that, due to different motivations and abilities, distinct faking types and strategies can be observed (Ziegler, 2011; Ziegler et al., 2015). Insights from qualitative studies have led to a step-by-step model of cognitive response processes (Robie et al., 2007; Ziegler, 2011), suggesting that response strategies are varied and complex. We found that researchers seldom take this variation into account, which may in part be responsible for inconclusive results.

The (lacking) usage of theory in practical research
Despite the distinct concept of AFB and the well-developed theoretical framework, recent faking studies continue to base their research on working definitions, such as socially desirable responding (Paunonen & LeBel, 2012), impression management (Fan et al., 2012), or self-presentation (Jansen, König, Stadelmann, & Kleinmann, 2012). These working definitions often differ considerably from the concept of AFB. For example, Marcus (2009) gives a definition of self-presentation: "any conscious or unconscious attempt to control impressions on partners in social interactions" (p. 418). What would be the opposite? How would an applicant act in selection procedures without leaving an impression? This definition subsumes every possible applicant behavior and therefore describes a broader concept than faking theories address. Socially desirable responding, as well as its conscious part, impression management (Paulhus, 1984), has also been criticized for having little in common with actual faking behavior (Burns & Christiansen, 2011; Tett & Christiansen, 2007).
Another issue in practical research is the simplification of faking as an attempt at score maximization. Respondents are classified by the amount their test score rises between honest and application conditions (Donovan et al., 2014; Griffith et al., 2007; Peterson, Griffith, & Converse, 2009). The bigger the score difference, the more extreme the faking. However, as stated before, even response strategies of extreme faking do not necessarily result in maximum scores. For example, Figure 3 presents a notional response pattern of an applicant under an honest and a faking condition. Let us suppose the applicant deems items 2, 3, and 5 irrelevant for the job and therefore chooses neutral options (as suggested by Ziegler, 2011). While the answers to all items change and follow an extreme faking strategy, the overall test score declines. Thus, as faking takes place on an item level and with several response strategies, the approach of defining faking as an attempt to maximize one's test score appears to be oversimplified.
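A notional pattern of this kind can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not a reconstruction of Figure 3: the five-point scale, the honest responses, and the rule "maximum on job-relevant items, neutral on irrelevant items" are all hypothetical. Only the choice of items 2, 3, and 5 as irrelevant follows the example in the text.

```python
# Hypothetical five-point scale; 3 is the neutral middle category.
NEUTRAL = 3
SCALE_MAX = 5

# Hypothetical honest responses to items 1-5 (not the values of Figure 3).
honest = {1: 4, 2: 5, 3: 4, 4: 3, 5: 5}
# Items the applicant deems irrelevant for the job, as in the text's example.
irrelevant_items = {2, 3, 5}


def fake_extremely(responses: dict, irrelevant: set) -> dict:
    """Assumed strategy: neutral on irrelevant items, maximum on the rest."""
    return {item: NEUTRAL if item in irrelevant else SCALE_MAX
            for item in responses}


faked = fake_extremely(honest, irrelevant_items)
changed_items = [item for item in honest if honest[item] != faked[item]]
honest_total = sum(honest.values())
faked_total = sum(faked.values())
score_difference = faked_total - honest_total

print(changed_items, honest_total, faked_total, score_difference)
```

With these assumed values, every response changes, yet the total score under the faking condition is lower than the honest total. A classification based on score gain alone would therefore treat this respondent as not having faked.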
[Figure 3. Notional item-level response pattern of an applicant; series: honest condition, applicant condition.]

Furthermore, empirical studies not accounting for faking theory may be influenced by researchers' implicit theories of faking and should therefore be compared and interpreted with caution. For example, Donovan et al. (2014) discuss two studies (Hogan, Barrett, & Hogan, 2007, and Ellingson, Sackett, & Connelly, 2007) which found faking not to occur in real personnel selection, stating that both made questionable assumptions on the nature of faking. More precisely, Hogan et al. (2007) assumed that applicants rejected by a company after an initial selection procedure would present themselves more positively when reapplying to the same company. In short, they would fake more extremely and get different test scores. The authors implied that the difference in the test scores was equivalent to a difference in test scores found when comparing honest and application conditions. Such assumptions contrast with present theoretical knowledge, because facets of motivation to fake (e.g., need for a job) hardly change when reapplying for the same job after a short period of time. The (implicit) assumption that faking on seven-point Likert scales is comparable to faking on dichotomous items (e.g., Ellingson et al., 2007; Hogan et al., 2007) is also questionable, because the two formats invite different response strategies; for example, dichotomous items offer no middle category (Ziegler, 2011). As such studies do not refer to the current state of faking theory, their results can seem inconclusive simply because they are based on (opposing) implicit theories. By contrast, as shown here, faking theory can help to explain such inconsistencies.
Regarding modeling and detection of AFB, we found approaches that take the complexity of response processes on an item level into account. With IRT and structural equation modeling (SEM), researchers have managed to give insights into individual response strategies, showing that faking varies both quantitatively and qualitatively among respondents (Ziegler et al., 2015) and that it depends on item content (O'Brien & LaHuis, 2011). Thus, we suggest that the combined use of qualitative and quantitative modeling techniques best suits the current understanding of AFB and should therefore be employed more often in future research.

Threats and drawbacks of frequently used detection methods
Despite the existence of statistical procedures that yield sophisticated insights into individual response processes (e.g., IRT and SEM), SDR scales are included in 85% of personality tests used in personnel selection (Goffin & Christiansen, 2003). However, these scales have been criticized for being correlated with personality traits and for being fakeable themselves (Burns & Christiansen, 2011; Ones et al., 1996). Thus, the inability of SDR scales to detect faking contrasts sharply with their widespread use in personnel selection.
Further, the use of SDR scales strikes us as an ethical issue, because they cannot detect applicant faking behavior but are sometimes the dominant basis for deciding whether an applicant gets a job offer (Christiansen et al., 2010). Currently, a number of methods of measuring faking exist (for an overview, see Christiansen, 2011 and, used in research as well as in personnel selection. O'Connell and colleagues (2011) demonstrated that different detectors of faking result in different classifications of respondents. Thus, whether an individual is classified as having faked does not necessarily depend on their actual faking behavior but on the method utilized for detection. As Tett and Christiansen (2007) concluded, research "is uninformative to the degree it relies on social desirability measures" (p. 982). We propose that this might hold true for most methods that do not account for the individual and item-related differences in actual faking behavior. While in research this is an issue of precision, in personnel selection it is an ethical issue of rejecting applicants based on erroneous conclusions.

Implications
Researchers and practitioners could benefit from examining applicant faking behavior as a concept distinct from socially desirable responding for two reasons. First, socially desirable responding scales are susceptible to faking and are therefore an impractical measure for describing applicants' behavior. SDR captures whether a respondent meets a profile that is socially desired in general, but this concept does not distinguish whether the desired profile is met deceptively or honestly. Second, AFB accounts for specific characteristics of the situation and of the respondents. Its theoretical framework allows for a differentiated look at distinct faking types and response strategies, and thus, modeling techniques that provide deep insights and are hardly fakeable could be developed.
Therefore, we also disagree with Sackett (2011), who claimed that a precise and sound understanding of faking behavior is unnecessary as long as we have methods to prevent faking. On the one hand, we propose that prevention of applicant faking behavior can never be effective as long as we have no definition of the exact behaviors that are to be impeded. Effective prevention methods should be developed considering these behaviors. On the other hand, without a clear concept of faking, how can one evaluate whether faking has been prevented? Thus, a sound theoretical understanding of applicant faking behavior is necessary for both creating and evaluating prevention techniques.
This also holds true for the examination of consequences of faking. There has been some debate on AFB's impact on predictive validity of personality tests, and while some researchers found that validity was not affected (Christiansen, Goffin, Johnston, & Rothstein, 1994;Ones & Viswesvaran, 1998;Schmitt & Oswald, 2006), others suggested a significant impact (Douglas, McDaniel, & Snell, 1996;Schmit, Ryan, Stierwalt, & Powell, 1995;Topping & O'Gorman, 1997). Based on our review, we suggest that these inconclusive results might derive from inadequate operationalization. As we argued before, AFB and SDR can be considered as distinct concepts and therefore studying the consequences of faking through a lens of SDR scales (e.g., Schmitt & Oswald, 2006) does not provide helpful insight and should be avoided.
Lastly, our findings yield implications for an ethical debate. Some scholars have argued that the wish to make a good impression or to put one's best foot forward can be considered natural and adaptive in a social world (Morgeson et al., 2007). From this perspective, terms like faking or lying carry overly negative connotations (Marcus, 2009). We propose that this debate could benefit from a more differentiated look at applicant faking behavior. As faking strategies and behaviors are both qualitatively and quantitatively distinguishable, it might be plausible to evaluate different types of slight faking as socially adapted and still useful for selection decisions, while different types of extreme faking, yielding no information about one's true self, could be evaluated differently.

Limitations
Our goal was to gain an overview of the concept of applicant faking behavior, its current state of theory, and how it can be distinguished from related constructs like SDR. Our definition and distinction of AFB was limited to examining faking in personality tests. As personality tests usually form a part of a selection procedure, faking in a personality test could also be correlated with faking elsewhere in the recruitment process. Thus, faking could be considered a general behavior throughout the selection process. While there are also theories that are applicable to faking in interviews and other selection methods (e.g., Marcus, 2009), we were unable to find research on this relationship. However, we propose that faking on personality tests also has unique techniques and facets and can therefore be examined as a distinct concept.
The definition of AFB provided in this review considers deviations from the true self-image. However, developments in personality test research suggest that deviations from the self-image also rely on self-deception (Sackett, 2011). There might be more than just one true self-image, as investigated in frame-of-reference research (Schmit et al., 1995). Self-deception is described as an unconscious deviation from one's true trait score (e.g., thinking you are less extraverted than you really are). By contrast, frame-of-reference effects imply that the true trait score itself varies across situations (e.g., people are more or less extraverted depending on the situation). Sackett (2011) combined both concepts, proposing that the true trait score as well as self-deception vary across situations. However, our definition does not account for frame-of-reference or self-deceptive variance, because it includes only one true self-image.
Regarding the techniques used to model and detect AFB, it would have been beyond the scope of this review to discuss all methods presented in the literature, and so the review was limited to a few representative techniques. This does not mean that other methods are impractical. Rather, we think the combination and integration of different techniques would allow a deeper insight into response processes, as in the combination of IRT and SEM analyses by Ziegler and colleagues (2015). While we criticized the response latencies approach for being too imprecise on an individual level, we do acknowledge that it sheds light on a physical component of faking behavior that cannot be detected through other methods. Using bivariate generalized linear item response theory (Molenaar, Tuerlinckx, & van der Maas, 2015), response latencies could be integrated into an IRT model of faking to gain even more insight into the relation between faking and response latencies. Other modeling techniques of faking could also benefit from integration into a wider methodological framework.

Conclusions
The present review suggests that applicant faking behavior is an important and wide field of research. However, it should not be forgotten that faking is first and foremost a practical issue, and therefore, research should also supply useful answers to practitioners in personnel selection. We think the consideration of applicant faking behavior, as a distinct, individual, and multidimensional response process, provides helpful information for practitioners to evaluate and handle faking in real-applicant situations as it gives a differentiated perspective on applicants' response strategies and underlying motivations and abilities. Future research should consider this perspective more often, for in the end the decisions about selection procedures are made in practice. This might lend a hand in bridging the scientist-practitioner gap (cf. Anderson, 2007;Cohen, 2007).