Overview
HackerRank helps ensure that your tests contain questions that have not been leaked elsewhere on the internet. HackerRank's leak detection tool performs an automated search at regular intervals to identify library questions that have been leaked.
In addition, if you come across a question that is available elsewhere, you can report it to HackerRank. See the article Report a Question Leak to learn how to report a leaked question.
The leaked questions across the library (both HackerRank and your company library) are displayed on the Leaked tab of the library.
Note: HackerRank's current leak detection capabilities are limited to Coding, Approximate Solution, Database, DevOps, and Fullstack question types.
What are Leaked Questions?
If you search the internet for a question from the HackerRank library and a search result matches the library question both structurally and semantically, the question is termed a leaked question.
Viewing the Leaked Questions
- In your HackerRank for Work account, click on the Library option from the top menu.
- Once you are on the Library page, click the Leaked tab. The Leaked tab lists all the questions marked as leaked in your library. A leaked question is displayed with a red triangle.
- You can filter these questions by My Company or HackerRank questions by choosing the desired filter on the left pane.
- You might have used one or more of these leaked questions in your tests. There is an additional filter, Hide Questions with No Test Impact, which helps you identify those questions and take necessary actions by filtering out the rest. This filter is selected by default.
- You can click on any of the leaked questions to view the following details:
  - Possible Matches: You can view the details of the website where the question or question fragment is available. You can visit that website by clicking on View under the Action column. The Confidence column indicates the likelihood that the question has been leaked.
  - Tests Impacted: You can check the tests where the question marked as leaked is used. You can click on View to navigate to that test and remove that question from the test.
Handling Leaked Questions
When you see that a library question is leaked, it is best not to remove it immediately but to follow change management best practices.
Here are some best practices you can follow when you encounter a leaked question:
- If you find a leaked question in the library or during test creation, you should select the Hide Leaked Questions check box and use unleaked questions in tests.
- If an existing test has a question that is marked as leaked, you can:
  - Remove that question from the test or replace it with a similar unleaked question using the Replace Questions functionality.
  - If the test is in progress, use your discretion and monitor the plagiarism checker results for that question more closely. For more information, see the help article on plagiarism detection. You can also contact support@hackerrank.com for troubleshooting.
- If you find that a completed test contains a leaked question, you can choose to disregard the question while tabulating scores from the candidate's attempts.
- You can also consider modifying the problem statement, since that is mostly what the tool matches against online content, so that it is not overly generic or readily available on the internet.
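As an illustration of disregarding a leaked question when tabulating scores for a completed test, here is a minimal Python sketch. It is not a HackerRank feature or API: the function name, the question IDs, and the score dictionaries are hypothetical, and it assumes you have exported each candidate's per-question scores yourself.

```python
def adjusted_score(question_scores, max_scores, leaked_ids):
    """Return a percentage score computed only over questions not marked as leaked.

    question_scores: hypothetical mapping of question ID -> points the candidate earned
    max_scores:      hypothetical mapping of question ID -> maximum points available
    leaked_ids:      set of question IDs to disregard
    """
    kept = [qid for qid in max_scores if qid not in leaked_ids]
    total_max = sum(max_scores[qid] for qid in kept)
    if total_max == 0:
        # Every question was disregarded, so there is nothing left to score.
        return 0.0
    earned = sum(question_scores.get(qid, 0) for qid in kept)
    return 100.0 * earned / total_max


# Illustrative data only: q3 is assumed to be the leaked question.
max_scores = {"q1": 50, "q2": 30, "q3": 20}
candidate = {"q1": 50, "q2": 15, "q3": 20}

original = adjusted_score(candidate, max_scores, set())
without_leak = adjusted_score(candidate, max_scores, {"q3"})
print(original, without_leak)
```

Rescaling to the remaining maximum keeps candidates comparable with one another, because the leaked question is dropped from both the earned points and the points available.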
Note: The tool highlights whether or not a question is leaked and checks for leaked content every day. The results are almost perfectly accurate, and HackerRank employs due diligence while dealing with 'leaked' content. You should do the same based on the nature of the question(s).
How HackerRank Works on Library Leakage Detection and Mitigation
HackerRank aims to focus on hiring decisions based on skill. Developers can showcase their skills in a fair and equitable testing environment. The integrity of the questions that comprise these tests is critical for developers and employers to feel confident in their fairness and efficacy.
Understanding HackerRank's Leak Detection
HackerRank has its own in-house tool that scrapes the internet to find potential leaks online. Questions that match a certain threshold are marked as leaked and hidden from the default library.
- How does this tool work?
  The tool performs an automated Google search of the Problem Statement. It returns a list of websites where it finds parts of the statement. Each URL is visited and scraped for its content. The matching unit consists of multiple string matching algorithms. Anything above a certain threshold of the matching score is considered leaked.
- How often is the leak detection run?
  HackerRank runs this tool once every day.
- Are there any questions that are leaked but not flagged?
  While HackerRank aims for very high accuracy in leak detection, there could be instances of leakage that the system does not flag. You can report such a question as described in the article Report a Question Leak.
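The matching step described under "How does this tool work?" can be pictured with a small Python sketch. This is not HackerRank's implementation: the article does not name the string matching algorithms or the threshold, so the two measures used here (word-shingle Jaccard similarity and `difflib.SequenceMatcher`), the way they are combined, and the `LEAK_THRESHOLD` value are all assumptions chosen for illustration. Searching and scraping are left out; the sketch starts from text that has already been fetched.

```python
from difflib import SequenceMatcher

LEAK_THRESHOLD = 0.8  # hypothetical value; the real threshold is not published


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences do not affect the score."""
    return " ".join(text.lower().split())


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = text.split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of the two texts' shingle sets, in [0, 1]."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def sequence_ratio(a, b):
    """Character-level similarity from the standard library, in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def match_score(statement, page_text):
    """Combine the two measures by taking the higher one (an assumed combination rule)."""
    s, p = normalize(statement), normalize(page_text)
    if not s or not p:
        return 0.0
    return max(jaccard(s, p), sequence_ratio(s, p))


def is_leaked(statement, page_text, threshold=LEAK_THRESHOLD):
    """Flag the page as a possible leak when the score reaches the threshold."""
    return match_score(statement, page_text) >= threshold


statement = "Given an array of integers, return the indices of the two numbers that add up to a target."
reposted = "  GIVEN an array of integers,\n return the indices of the two numbers that add up to a target. "
unrelated = "Our bakery sells fresh bread and pastries every morning."

print(is_leaked(statement, reposted), is_leaked(statement, unrelated))
```

Using more than one measure reflects the article's mention of multiple string matching algorithms: a shingle-based measure tolerates reordered paragraphs, while a character-level ratio catches near-verbatim copies with small edits.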
Understanding HackerRank's Mitigation
Question leaks compromise the integrity of the test questions, thus making it difficult for companies to identify developers with the most vital technical skills. HackerRank ensures that the question integrity is maintained by monitoring and containing the leaked questions.
United States copyright law and the DMCA protect content from HackerRank’s question library. This allows HackerRank to issue DMCA takedown notices where users or websites infringe HackerRank's copyright in violation of the fair use doctrine. A proactive DMCA policy is the best way to ensure that the assessments are a fair and effective way for developers to demonstrate their skills.
HackerRank's online leak detection system proactively searches the internet for possible leaks of library questions. Then, a manual review is performed for every detection on a case-by-case basis. HackerRank sends a DMCA takedown notice only after confirming the accuracy of the detection and considering how the decision impacts developer communities.
HackerRank’s DMCA takedowns follow a four-step process of identification, prioritization, manual review, and communication. Read more on how HackerRank takes down leaked content from the web.
Identification:
- HackerRank actively monitors for leaked content by using automation to detect infringing material.
- HackerRank then records the domain, question ID, and question creation date of each flagged URL.
Prioritization:
HackerRank prioritizes its DMCA responses based on the severity of the breach and the type of domain, in addition to a range of other factors. Takedowns are prioritized for pages with one or more of the following characteristics:
- Web pages that flagrantly post screenshots and a verbatim copy of question text of regularly used questions.
- Web pages that, in reposting content, also infringe on the company’s logo and name.
- For-profit domains such as course or training websites that infringe HackerRank questions to drive transactions or user signups.
- Online content that copies the default code stubs provided with each question.
HackerRank deprioritizes takedowns of pages created by developer communities that publish content and developer-created solutions for learning and educational purposes.
Review:
- HackerRank’s product content team conducts manual, in-house reviews of every flagged URL to rule out false positives.
- Multiple reviewers assess each URL to ensure a fair and accurate review process.
- If a false positive is identified, the question is marked as not leaked and the URLs are whitelisted.
- If a case does violate HackerRank’s copyright and DMCA policy, the content is advanced to the next stage.
Communication:
- HackerRank sends a DMCA notice to the online service provider if a page warrants a takedown.
- The notice will always contain the following:
  - Why the recipient is receiving the notice
  - How to respond to the notice
  - The required response timeline
  - How to dispute a DMCA takedown
HackerRank's Library Refresh
Based on the leakage, HackerRank backfills the library with questions each quarter to maintain a pool of questions for each skill. In the first three quarters of 2022, HackerRank added an average of 435 questions to the library each quarter (in the enterprise plan). These questions spanned hands-on projects, SQL, and coding. They were added to address popular content requests from customers, to backfill for questions leaked from the HackerRank library, and to refresh the library with new content across diverse skills.
These additional questions enable customers to replace leaked questions in their tests.