We help companies make the right hiring decisions. Our goal is to identify candidate submissions that are likely to be plagiarized by determining whether their code is similar to that of other submissions. We optimize for candidate experience and work to reduce false alarms so that we do not penalize innocent candidates.
Our plagiarism flag indicates that someone may have copied code. Although we detect code similarity, we cannot determine the reason for it. The flag saves you time by pointing out the cases that are worth a detailed examination. We recommend that a developer review the highlighted code to decide whether it is an actual case of plagiarism. We do not recommend auto-rejecting a candidate based on the plagiarism flag.
We use two algorithms for detecting plagiarism: Moss (Measure of Software Similarity) and String comparison. We compute the similarity of the code with both algorithms. For shorter submissions (those with fewer lines of code), we use the minimum of the two results. For longer submissions (those with more lines of code), we give precedence to Moss over String comparison.
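The combination rule described above can be sketched as follows. This is only an illustration under stated assumptions, not our production logic: the function name `combined_similarity`, the 0 to 1 score scale, and the `short_line_limit` cutoff of 20 lines are all hypothetical, and "precedence to Moss" is interpreted here as simply using the Moss score.

```python
def combined_similarity(
    moss_score: float,
    string_score: float,
    line_count: int,
    short_line_limit: int = 20,  # hypothetical cutoff; the real threshold is not stated
) -> float:
    """Combine two similarity scores (assumed to be in the range 0.0 to 1.0)."""
    if line_count <= short_line_limit:
        # Short submissions: take the lower of the two scores, which
        # keeps false alarms down when little code is available.
        return min(moss_score, string_score)
    # Longer submissions: Moss takes precedence (assumed to mean
    # that the Moss score is used as-is).
    return moss_score
```

Taking the minimum for short submissions is the conservative choice: a short program is flagged only when both algorithms agree that it is similar.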
Moss is the more advanced of the two algorithms. It tokenizes the code, and the tokenized versions of all candidates' source code are compared to identify pairs of documents that have substantial overlap. Some candidates try to change variable names or introduce white space to deceive plagiarism detection. This typically does not work against Moss because the structure of the program is unchanged, and the number of token and line matches between the documents stays the same.
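To illustrate why renaming variables and adding white space does not help, here is a minimal sketch of token-based comparison. It is not the Moss implementation: it uses Python's standard `tokenize` module, replaces every identifier with a placeholder, and compares sets of token k-grams with a Jaccard ratio. The helper names (`normalized_tokens`, `kgrams`, `token_similarity`) and the choice of `k = 4` are assumptions made for this example.

```python
import io
import keyword
import tokenize

# Token types that carry layout only and are ignored in this sketch.
_LAYOUT = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}

def normalized_tokens(source: str) -> list[str]:
    """Tokenize Python source, abstracting away names, literals, and layout."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _LAYOUT:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")   # any variable or function name
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)  # keywords and operators stay as they are
    return tokens

def kgrams(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def token_similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the two programs' token k-gram sets."""
    grams_a = kgrams(normalized_tokens(a), k)
    grams_b = kgrams(normalized_tokens(b), k)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

original = (
    "def total(values):\n"
    "    result = 0\n"
    "    for v in values:\n"
    "        result += v\n"
    "    return result\n"
)
# Same program with renamed identifiers and extra white space.
renamed = (
    "def  sum_all( items ):\n"
    "\n"
    "    acc = 0\n"
    "    for x in items:\n"
    "        acc   +=   x\n"
    "    return acc\n"
)
score = token_similarity(original, renamed)
```

Because both programs reduce to the same token sequence, `score` is 1.0 here even though a plain character-by-character comparison of the two texts would report many differences.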
The following screenshot shows how Moss detects candidates whose code has the same structure and logic but who changed the variable names or used a for loop instead of a while loop. Moss checks for similarity in structure: it strips out the variable names and compares what remains, which helps in catching plagiarism.