Resume parsing, also known as CV parsing, resume extraction, or CV extraction, allows for the automated storage and analysis of resume data. The resume is imported into parsing software and the information is extracted so that it can be sorted and searched.
Description
Resume parsers analyze a resume, extract the desired information, and insert the information into a database with a unique entry for each candidate. Once the resume has been analyzed, a recruiter can search the database for keywords and phrases and get a list of relevant candidates. Many parsers support semantic search, which adds context to the search terms and tries to understand intent in order to make the results more reliable and comprehensive. The candidates returned are ranked based on how closely they match the keywords and job profile.
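The store, search, and rank flow described above can be illustrated with a minimal sketch. The `Candidate` schema, the `rank_candidates` function, and the scoring rule (the share of requested keywords found among a candidate's skills) are assumptions made for illustration; commercial parsers use richer data models and semantic matching rather than exact keyword overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One database entry per candidate (hypothetical schema)."""
    candidate_id: int
    name: str
    skills: set = field(default_factory=set)


def rank_candidates(candidates, keywords):
    """Return (candidate, score) pairs, best match first.

    The score is the fraction of requested keywords found in the
    candidate's skills; candidates matching nothing are left out.
    """
    wanted = {k.lower() for k in keywords}
    scored = []
    for c in candidates:
        have = {s.lower() for s in c.skills}
        score = len(wanted & have) / len(wanted) if wanted else 0.0
        if score > 0:
            scored.append((c, score))
    # Highest score first; ties are broken by candidate_id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0].candidate_id))


database = [
    Candidate(1, "A. Rivera", {"Python", "SQL", "Statistics"}),
    Candidate(2, "B. Chen", {"Java", "SQL"}),
    Candidate(3, "C. Okafor", {"Graphic design"}),
]

results = rank_candidates(database, ["python", "sql"])
for candidate, score in results:
    print(f"{candidate.name}: {score:.0%}")
```

In this toy database the first candidate matches both keywords, the second matches one, and the third is excluded from the results.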
Machine learning
Machine learning is extremely important for resume parsing. Each block of information needs to be given a label and sorted into the correct category, whether that's education, work history, or contact information. Rule-based parsers use a predefined set of rules to parse the text. This method works poorly for resumes because the parser needs to "understand the context in which words occur and the relationship between them." For example, if the word "Harvey" appears on a resume, it could be the name of an applicant, refer to the college Harvey Mudd, or reference the company Harvey & Company LLC. The abbreviation MD could mean "Medical Doctor" or "Maryland". A rule-based parser would require incredibly complex rules to account for all the ambiguity and would still provide limited coverage.
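The ambiguity problem can be made concrete with a small sketch. The lookup table, the label names, and the single context rule below are hypothetical and are not taken from any real parser. The aim is only to show how one hand-written rule handles one case and mislabels another.

```python
# Hypothetical lookup table: each term maps to every label it could carry.
AMBIGUOUS_TERMS = {
    "harvey": [
        "PERSON_NAME",
        "SCHOOL (Harvey Mudd)",
        "COMPANY (Harvey & Company LLC)",
    ],
    "md": ["DEGREE (Medical Doctor)", "US_STATE (Maryland)"],
}


def candidate_labels(token):
    """Return all labels a token might have, ignoring case and punctuation."""
    return AMBIGUOUS_TERMS.get(token.lower().strip(".,"), [])


def label_with_context(token, previous_token):
    """Apply one hand-written context rule, then fall back to the first label."""
    labels = candidate_labels(token)
    if token.upper() == "MD" and previous_token.endswith(","):
        # Intended for "Baltimore, MD", but "Jane Doe, MD" has the same shape.
        return "US_STATE (Maryland)"
    return labels[0] if labels else "UNKNOWN"


print(candidate_labels("Harvey"))              # three possible readings
print(label_with_context("MD", "Baltimore,"))  # state, as intended
print(label_with_context("MD", "Doe,"))        # also state, which is wrong here
```

Each extra rule written to repair a case like the last one tends to introduce new exceptions, which is why the rule set grows so quickly.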
This is where machine learning, and specifically natural language processing (NLP), comes in. NLP is a branch of artificial intelligence that uses machine learning to understand content and context and to make predictions. Several NLP techniques are especially important in resume parsing. Acronym normalization and tagging accounts for the different possible formats of an acronym and maps them to a single form. Lemmatization reduces words to their root using a language dictionary, while stemming strips suffixes such as "s" and "ing". Entity extraction uses regular expressions, dictionaries, statistical analysis and complex pattern-based extraction to identify people, places, companies, phone numbers, email addresses, important phrases and more.
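A few of these techniques can be sketched with the Python standard library alone. The patterns and helper names below are illustrative assumptions. The email and phone patterns are deliberately simplified (the phone pattern only covers ten-digit North American style numbers), the stemmer only strips "ing" and "s", and production systems rely on trained models and dictionaries rather than rules this small.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def extract_contacts(text):
    """Regex-based entity extraction for email addresses and phone numbers."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}


def normalize_acronym(text):
    """Map variants such as "M.B.A." and "mba" to one form ("MBA")."""
    return text.replace(".", "").upper()


def naive_stem(word):
    """Strip a trailing "ing" or "s" from words long enough to keep a stem."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


sample = "Contact: jane.doe@example.com, (555) 123-4567."
print(extract_contacts(sample))
print(normalize_acronym("M.B.A."), normalize_acronym("mba"))
print(naive_stem("managing"), naive_stem("teams"))
```

The stemmer turns "managing" into "manag", which is not a dictionary word. That is the practical difference from lemmatization, which would look the word up and return "manage".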
Effectiveness
Resume parsers have achieved up to 95% accuracy, which refers to the accuracy of data entry and categorizing the data correctly. Human accuracy is typically not greater than 96%, so the resume parsers have achieved "near human accuracy."
One executive recruiting company tested three resume parsers and humans to compare the accuracy in data entry. They ran 1000 resumes through the resume parsing software and had humans manually parse and enter the data. The company brought in a third party to evaluate how the humans did compared to the software. They found that the results from the resume parsers were more comprehensive and had fewer mistakes. The humans did not enter all the information on the resumes and occasionally misspelled words or wrote incorrect numbers.
In a 2012 experiment, a resume for an ideal candidate was created based on the job description for a clinical scientist position. After going through the parser, one of the candidate's work experiences was completely lost because the date was listed before the employer. The parser also missed several educational degrees. As a result, the candidate received a relevance ranking of only 43%. If this had been a real candidate's resume, they would not have moved on to the next step even though they were qualified for the position. It would be helpful if a similar study were conducted on current resume parsers to see whether there have been any improvements over the past few years.
Benefits
- A famous study conducted by Marianne Bertrand and Sendhil Mullainathan in 2003 looked at whether candidates with the names Emily and Greg were more employable than Lakisha and Jamal. The conclusion was that resumes with white-sounding names received 50% more callbacks than ones with black-sounding names. In 2014, a study was done in Australia and New Zealand to investigate name discrimination based on gender. Insync Surveys, a research firm, and Hays, a recruitment specialist, sent out a resume to 1,029 hiring managers with the name being the only difference. Half the hiring managers received a resume for Simon Cook and the other half got a resume for Susan Campbell. The study found that Simon was more likely to get a callback. Resume parsing allows candidates to be ranked based on objective information and can help reduce the bias that so easily shows up in the hiring process. The software can be programmed to ignore and hide factors that contribute to bias such as name, gender, race, age, address and more.
- The technology is extremely cost-effective and a resource saver. Rather than asking candidates to enter the information manually, which could discourage them from applying, or having recruiters spend their time on it, data entry is now done automatically.
- The contact information, relevant skills, work history, educational background and more specific information about the candidate is easily accessible.
- The applicant screening process is now significantly faster and more efficient. Instead of having to look at every resume, recruiters can filter them by specific characteristics, sort and search them. This allows recruiters to move through the interview process and fill positions at a faster rate.
- One of the biggest complaints people searching for jobs have is the length of the application process. With resume parsers, the process is now faster and candidates have an improved experience.
- The technology helps prevent qualified candidates from slipping through the cracks. On average, a recruiter spends 6 seconds looking at a resume. When a recruiter is looking through hundreds or thousands of them, it can be easy to miss or lose track of potential candidates.
- Once a candidate's resume has been analyzed, their information remains in the database. If a position comes up that they are qualified for but have not applied to, the company still has their information and can reach out to them.
- Since the technology has become so efficient, many companies now allow applicants to apply using just their LinkedIn profile.
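The first benefit in the list above mentions hiding factors such as name, gender, race, age and address from reviewers. A minimal sketch of that idea follows, assuming the parser has already produced a flat dictionary of fields. The field names and the `redact_for_screening` helper are hypothetical.

```python
# Fields assumed to carry bias-related information (hypothetical names).
BIAS_FIELDS = frozenset({"name", "gender", "race", "age", "address"})


def redact_for_screening(parsed_resume, hidden_fields=BIAS_FIELDS):
    """Return a copy of the parsed resume without the hidden fields."""
    return {
        key: value
        for key, value in parsed_resume.items()
        if key not in hidden_fields
    }


parsed = {
    "name": "Susan Campbell",
    "address": "12 Example Street",
    "skills": ["budgeting", "forecasting"],
    "years_experience": 7,
}

screening_view = redact_for_screening(parsed)
print(screening_view)
```

The original record is left untouched, so the hidden fields remain available later in the process, for example when the candidate is contacted.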
Challenges
The parsing software has to rely on complex rules and statistical algorithms to correctly capture the desired information in the resumes. There are many variations of writing style, word choice, syntax, etc. and the same word can have multiple meanings. The date alone can be written hundreds of different ways. It is still a challenge for these resume parsers to account for all the ambiguity. Natural Language Processing and Artificial Intelligence still have a way to go in understanding context-based information and what humans mean to convey in written language. One company that offers a resume parser includes in the description of the product that "Resume parsing is rarely perfect."
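The date problem gives a sense of the scale of the challenge. The sketch below tries a short list of formats with `datetime.strptime` and normalizes whatever it recognizes to `YYYY-MM`. The list of formats and the `normalize_month` helper are assumptions chosen for illustration, month names are assumed to be in English (the default C locale), and anything outside the list is simply not understood.

```python
from datetime import datetime

# A handful of the many ways a month and year might be written.
DATE_FORMATS = ("%m/%Y", "%Y-%m", "%B %Y", "%b %Y", "%b. %Y")


def normalize_month(text):
    """Return the date as "YYYY-MM", or None if no known format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%Y-%m")
        except ValueError:
            continue  # try the next format
    return None


for raw in ("03/2015", "March 2015", "Mar. 2015", "2015-03", "Spring 2015"):
    print(f"{raw!r} -> {normalize_month(raw)}")
```

The first four inputs normalize to the same value, while "Spring 2015" returns `None`. Covering every such variant with explicit formats is impractical, which is one reason parsers combine rules with statistical methods.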
Resume optimization
Resume parsers have become so omnipresent that, rather than writing for a recruiter, candidates should focus on writing for the parsing system. Understanding how parsers work is a great first step, but there are also specific changes an applicant can make to optimize their resume. Here are some tips on how to do that:
- Use keywords from the job description in relevant places on your resume. These keywords will almost certainly be included in the parsing process.
- Don't use headers or footers. They tend to confuse the parsing algorithms.
- Use a simple style for fonts, layouts and formatting.
- Avoid graphics.
- Use standard section names such as "Work Experience" and "Education".
- Avoid using acronyms unless they're included in the job description. The safest option may be to write the long form and include the acronym after in parentheses.
- Don't start with dates in the "Work Experience" section. Parsers typically look for dates following job titles or company names.
- Stay consistent when formatting past work experience. The standard order is job title, company name, and then employment dates.
- Most resume parsers claim to work with all of the main file types, but stick with docx, doc and pdf to be on the safe side.
Software and vendors
There are many options for resume parsers including BurningGlass, Sovren, Daxtra, HireAbility, Rchilli, TextKernel, Trovix and RapidParser. Resume parsers are also typically bundled with Applicant Tracking Systems, which companies use to streamline the hiring process. 90% of Fortune 500 companies use Applicant Tracking Systems, which can do everything from processing job applications to managing the recruiting process and executing the hiring decision.
When choosing a resume parser, it is important to look at coverage. All resume parsers extract the basics such as skills, education and work experience. However, the more advanced parsers are also able to extract desired salary and location, hobbies, references and more. The other important consideration is accuracy. If the accuracy of the parser is below 90%, then the benefits are not as relevant due to the cost associated with supervising data entry and fixing errors.
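One way to check a parser against the 90% threshold is to compare its output with hand-verified values on a sample of resumes. The sketch below assumes a simple field-level definition of accuracy (the fraction of expected fields the parser got exactly right). The `field_accuracy` helper and the sample data are hypothetical, and vendors may define and measure accuracy differently.

```python
def field_accuracy(parsed, expected):
    """Fraction of expected fields whose parsed value matches exactly."""
    if not expected:
        return 0.0
    correct = sum(
        1 for key, value in expected.items() if parsed.get(key) == value
    )
    return correct / len(expected)


# Hand-verified values for one resume, and what a parser returned for it.
expected = {
    "email": "jane.doe@example.com",
    "degree": "BSc Chemistry",
    "employer": "Example Labs",
    "start_date": "2015-03",
}
parsed = {
    "email": "jane.doe@example.com",
    "degree": "BSc Chemistry",
    "employer": "Example Labs",
    # start_date was missed by the parser
}

accuracy = field_accuracy(parsed, expected)
print(f"accuracy: {accuracy:.0%}, meets 90% threshold: {accuracy >= 0.90}")
```

Here three of four fields match, so this resume scores 75%. In practice the figure would be averaged over a large sample of resumes before comparing it with the threshold.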
Future
Resume parsers are already standard in most mid- to large-sized companies and this trend will continue as the parsers become even more affordable.
A qualified candidate's resume can be ignored if it is not formatted the proper way or doesn't contain specific keywords or phrases. As Machine Learning and Natural Language Processing get better, so will the accuracy of resume parsers.
One of the areas resume parsing software is working on expanding into is performing contextual analysis on the information in the resume rather than purely extracting it. One employee at a parsing company said "a parser needs to classify data, enrich it with knowledge from other sources, normalize data so it can be used for analysis and allow for better searching."
Parsing companies are also being asked to expand beyond just resumes or even LinkedIn profiles. They are working on extracting information from industry-specific sites such as GitHub and social media profiles.
References
Source of the article : Wikipedia