Chapter 3: Data Collection and Storage
Module 10: Dealing with Deception
Deception in research refers to fraudulent participation, whether by individuals who supply misleading or false information or by bots that submit automated responses. In online survey research, deception is common; it can waste resources and participation opportunities and create significant bias and distortion in data. Let’s go through how to identify red flags for deception and how to mitigate it!
Learning Objectives
- Identify red flags of deception and fraudulent participation
- Create strategies to prevent and resolve deception in research
Case Study
Dr. Sharp is conducting a study on digital accessibility and offers a $20 honorarium for completing a survey. After launching, he notices an influx of responses with unusual answer patterns and vague free-text answers. Some participants even claim identical, rare disabilities. Dr. Sharp reaches out to one of these participants by email but receives no response.
Types of Online Survey-Based Fraud
Have you ever been scrolling through social media and found a fake account pretending to be a real person? Colloquially called bots, these accounts may be used for a variety of purposes: inflating follower counts, posting inflammatory content, or even promoting fraud. Bots also appear in online survey research, where they cause similar harm by skewing data, introducing bias, and taking participation opportunities away from real humans.
While bots are a concern, they are only one type of fraudulent participant. So, what qualifies as deception? What qualifies as a “fraudulent participant”? A fraudulent participant is an individual who provides misleading, false, or duplicate responses.
Types of fraudulent participants include bots, multiple takers, alias scammers, response distorters, and careless responders. Here are some quick definitions of these forms of fraudulent participation in the context of online survey research:
| Type | Definition |
| --- | --- |
| Bots | Software that fills in survey data automatically and excessively. |
| Multiple Takers | People who take the survey more than once to collect additional incentives or to deliberately distort data. |
| Alias Scammers | People who use false identities to take surveys and claim incentives. Their responses may be genuine, but they are still acting under a false identity. |
| Response Distorters | People who deliberately enter data to trick or twist results, whether to affect researchers, claim incentives, or distort data. They may or may not be genuinely eligible to participate. |
| Careless Responders | Participants who are eligible to participate but answer questions with low attention, often leaving responses blank or poorly completed. |
While some fraud is intentional, other cases stem from misunderstanding questions or accessibility barriers.
For example, some participants may struggle with typing or structured responses due to their disabilities. Instead of relying solely on rigid exclusion criteria, researchers should employ thoughtful review processes to distinguish genuine errors from deliberate fraud, ensuring that accessibility is not compromised in the pursuit of data integrity.
Identifying and Addressing Deception in Research
Recognizing red flags for deception in digital research is crucial to catching fraudulent participants. Here are some common red flags:
- Suspicious Contact Information
- Participant’s Expression of Interest in Study
- Odd Responses
- Use of AI for Writing Fraudulent Responses
- Distant Participants
- Urgency Related to Reimbursement
Responding to Deception in Research
Now that we can flag deceptive responses, what can we do to prevent or address them?
Location
If your research focuses on a specific region (for example, Canada, Ontario, or Toronto), you may consider location-based verification. This might include collecting postal codes, phone numbers, or IP addresses. However, this style of verification can raise privacy concerns: some legitimate participants may be discouraged from joining digital research that collects too much personal information. It is also possible for multiple legitimate participants to share an IP address, for example when they use a public computer.
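As a minimal sketch of location-based verification, the snippet below checks self-reported postal codes against the Canadian format and against Ontario’s forward sortation prefixes. The participant IDs, sample data, and specific rules are hypothetical; adapt them to your own study region and platform.

```python
import re

# Hypothetical rules: a valid Canadian postal code looks like "A1A 1A1",
# and Ontario forward sortation areas begin with K, L, M, N, or P.
CANADIAN_POSTAL = re.compile(r"^[A-Z]\d[A-Z]\s?\d[A-Z]\d$")
ONTARIO_PREFIXES = ("K", "L", "M", "N", "P")

def flag_postal_code(raw: str) -> list[str]:
    """Return a list of reasons this postal code looks suspicious."""
    flags = []
    code = raw.strip().upper()
    if not CANADIAN_POSTAL.match(code):
        flags.append("not a valid Canadian postal code")
    elif not code.startswith(ONTARIO_PREFIXES):
        flags.append("postal code outside Ontario")
    return flags

# Fabricated example responses, for illustration only
responses = {"P-001": "M5S 1A1", "P-002": "90210", "P-003": "V6B 4Y8"}
for pid, code in responses.items():
    for reason in flag_postal_code(code):
        print(f"{pid}: {reason} ({code!r})")
```

A flag from a check like this should prompt review, not automatic exclusion, given the privacy and accessibility caveats above.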
Survey Design
Designing a survey to deter fraudulent participants helps you maintain a higher level of response integrity. Strategies include:
- patient and public involvement, to help balance participants’ needs;
- traps for bots hidden in the questionnaire, such as questions that are invisible to human participants but not to bots;
- attention-check questions, where the respondent is asked to choose a specific option;
- illogical options on multiple-choice questions.
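The bot-trap and attention-check ideas above can be screened for programmatically. The following is a minimal sketch, assuming each response arrives as a dictionary of question IDs to answers; the field names ("honeypot", "q7") and the expected attention-check answer are hypothetical.

```python
def screen_response(response: dict) -> list[str]:
    """Return reasons a response looks fraudulent, if any."""
    reasons = []
    # Honeypot: a question hidden from human participants (e.g., via CSS)
    # but visible to bots; any non-empty answer is a strong bot signal.
    if response.get("honeypot"):
        reasons.append("honeypot field completed")
    # Attention check: the respondent was instructed to pick a specific
    # option ("Strongly agree" in this hypothetical item).
    if response.get("q7") != "Strongly agree":
        reasons.append("failed attention check")
    return reasons

# Fabricated example response
suspect = {"honeypot": "lorem ipsum", "q7": "Disagree"}
print(screen_response(suspect))
# ['honeypot field completed', 'failed attention check']
```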
Added security measures against bots can include CAPTCHA tests, which are available on certain survey platforms, including REDCap. However, some users who rely on screen readers report that CAPTCHA tests can be inaccessible. Pilot these security measures with a diverse group of people with disabilities before fully rolling out your survey.
Incentives
As mentioned in Module 8: Informed Consent and Assent, disproportionately large incentives are problematic because they increase the chances of undue influence. In general, aim for reimbursements that would be difficult or impossible for fraudsters or bots to redeem; for example, send reimbursements through the mail to the address a participant provided, or to an individual email address or phone number. It is also important to check whether multiple responses share the same mailing address, email, or phone number, so that fraudsters cannot claim multiple incentives. Ultimately, providing online or digital incentives is risky, so non-monetary or raffle-style incentives may be a better option.
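As a minimal sketch of that duplicate check, the snippet below groups responses by normalized email and phone number so duplicate claimants can be reviewed before incentives go out. The records and field names are fabricated for illustration.

```python
from collections import defaultdict

# Fabricated example records
records = [
    {"id": "P-001", "email": "Jane.Doe@example.com", "phone": "416-555-0101"},
    {"id": "P-002", "email": "jane.doe@example.com ", "phone": "4165550101"},
    {"id": "P-003", "email": "sam@example.com", "phone": "647-555-0199"},
]

def normalize_email(email: str) -> str:
    # Trim whitespace and lowercase so trivial variants match.
    return email.strip().lower()

def normalize_phone(phone: str) -> str:
    # Keep digits only so "416-555-0101" and "4165550101" match.
    return "".join(ch for ch in phone if ch.isdigit())

by_contact = defaultdict(list)
for rec in records:
    by_contact[("email", normalize_email(rec["email"]))].append(rec["id"])
    by_contact[("phone", normalize_phone(rec["phone"]))].append(rec["id"])

for (kind, value), ids in by_contact.items():
    if len(ids) > 1:
        print(f"Duplicate {kind} {value!r}: review {ids} before paying incentives")
```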
Data Checking
Conducting a quality check on survey data is a good way to exclude deceptive responses. Some survey platforms have strong built-in data quality checks; in other cases, you may have to check data quality yourself. Data quality rules can be set by the research team so that certain combinations of responses, or the lack thereof, flag a possible fraudulent participant. For example, a survey may have 10 multiple-choice questions, each with 5 possible options plus an option of “Prefer not to respond”.
An example of a data quality rule might be that any participant who responded to more than 5 questions with “Prefer not to respond” should be flagged for review. Some systems, such as the REDCap survey platform, can automate this check and flag such participants; in other cases, researchers may have to do this manually by scanning responses for red flags.
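As a minimal sketch, the rule above could be implemented as follows; the question IDs, threshold, and sample responses are hypothetical.

```python
# Flag any participant who answered more than 5 of the 10 questions
# with "Prefer not to respond", per the rule described above.
THRESHOLD = 5
QUESTIONS = [f"q{i}" for i in range(1, 11)]

def needs_review(response: dict) -> bool:
    declines = sum(
        1 for q in QUESTIONS if response.get(q) == "Prefer not to respond"
    )
    return declines > THRESHOLD

# Fabricated example responses
participants = {
    "P-001": {q: "Prefer not to respond" for q in QUESTIONS},
    "P-002": {q: "Option A" for q in QUESTIONS},
}
for pid, resp in participants.items():
    if needs_review(resp):
        print(f"{pid}: flagged for manual review")
```

Note that a flag here should trigger human review rather than automatic exclusion, since genuine participants may decline questions for legitimate reasons.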
Identity Validation
While collecting extensive personal data is one way to tackle identity validation, there are ethical considerations to note, such as identifiability from combined data, and data protection and security. Researchers should prioritize clear communication, informing participants how their responses are validated and providing an appeal process for flagged entries. Ethical fraud prevention also means avoiding harmful assumptions; for example, participants who give unexpected responses, use imperfect grammar, or have accents should not automatically be labeled as fraudulent.
Balancing Security with Accessibility and Inclusion
While fraud prevention is necessary, overly strict security measures can create barriers for legitimate participants, especially those with disabilities or limited digital literacy. CAPTCHA tests, strict time limits, and advanced verification steps can unintentionally exclude people who use assistive technology, have cognitive disabilities, or face internet instability.
Inclusive security methods, such as human review of de-identified suspicious responses, clear instructions on verification steps, and flexible participation options, let you do your due diligence as a researcher without imposing indiscriminate hard stops.
It’s also important not to assume that all inconsistencies are fraud; researchers should explore whether accessibility barriers are contributing to unusual response patterns. A nuanced approach ensures that genuine participants are not unfairly filtered out while fraudulent entries are minimized.
Module 10 Activity 1
The following article has some great information about fraudulent participants, or “fraudsters”! Please feel free to use this resource, which was published by Johnson, Adams and Byrne (2023), titled “Addressing fraudulent responses in online surveys: Insights from a web-based participatory mapping study”.