How an English literature graduate built an advanced NLP engine: story of Priyadarshi Lahiri
Our Techie Tuesdays’ candidate of the week is Priyadarshi Lahiri, CTO of Edge Networks. Though he did not take the usual academic path of engineers, that didn’t stop him from building one of the most advanced NLP engines.
Priyadarshi is the only one in his tech team without an engineering degree. In fact, he’s the CTO of the company where he works. His love for literature led him to pursue a Bachelor of Arts, but that was accompanied by a decade-long tryst with coding. While enjoying Shakespeare in college, Priyadarshi was already taking freelance projects in tech. Fast forward a decade, and Priyadarshi got an opportunity to combine his love for literature and technology by building an NLP engine. Powered by deep learning today, it’s a recruiters’ paradise.
Priyadarshi is our Techie Tuesdays’ candidate for the week. Here’s an interesting story of a kid who started working on computers at the age of five and just didn’t stop for the next three decades.
Youngest student in Kolkata to be provided an internet connection
Priyadarshi was born and brought up in Kolkata. His father was an entrepreneur with his own marketing practice in the city. Priyadarshi first got access to a computer when he was five. It was an IBM PC XT with a 12 MHz processor, a 20 MB hard drive, 640 KB of RAM and a monitor that could display 64 shades of grey. He started scripting with BASIC the same year. He recalls, “I was working on WordStar word processor, Lotus 123 spreadsheet and dBase database programs which were the three pillars of working with computers.”
Priyadarshi’s first brush with the online world was with a Bulletin Board System (BBS). It was all in a text-based interface. When he was in Class VI, Videsh Sanchar Nigam Limited (VSNL) introduced internet in Kolkata. Unfortunately, only students of Class X or above could apply for a student account, and the other option was quite costly. So, Priyadarshi wrote a letter to the Chief General Manager (CGM) of VSNL, Kolkata, stating his disappointment over the rule. He recalls, “I said, I did not understand why there was an age cap on this. I knew how internet worked, I had worked on BBS and had a pretty good knowledge of computers.” His letter drew a response from the CGM, who invited him to his office. When he showed his skills to the CGM in a room full of VSNL employees, he was awarded the connection and became the youngest student in Kolkata to be provided an internet connection.
12-year-old teacher for 30-year-old students
Priyadarshi learnt HTML immediately after getting an internet connection. He set up his presence on GeoCities, a free website hosting provider. That’s how he also came to know about a training institute in Kolkata which was looking for someone who could teach HTML. At that time, he was still in Class VII. He went to the centre and signed up for the job. He says,
The institute didn't inform the trainees/students who were typically 20-40 years of age, that a kid was to teach them; they were quite shocked to see me on day one and some of them got up and walked out. However, things changed from the second day when I taught them how to craft web pages. They even felt apologetic about their behaviour.
Turning into an entrepreneur
Priyadarshi and his father started a business in Kolkata called PostHaste. It helped people communicate with their relatives/friends in the US in an economical way. ISD call rates were very high then. Priyadarshi architected a system with five phone lines connected to a fax machine. He explains, “Anyone could go to a partner PCO (Public Call Office), write out a letter, put an email address on it, and fax it to our central hub. We would then put the fax on the flatbed scanner, OCR it and then dispatch it by email. We had the central email address where the replies used to come back which we used to fax back to the PCOs. The concerned person could then collect it from the PCO. This allowed people to have long conversations without prohibitive charges (on telephone).”
PostHaste ran well for a few years, until cyber cafes became popular. Later on, Priyadarshi took up contracts to set up cyber cafes. He also took up a project to build aapkidukaan.com, an e-commerce portal without online payment support. Users could browse the inventory of local stores around them and place orders, paying offline once delivery took place. Priyadarshi built it on ASP. It had a Java applet with a tree structure for product categorisation. However, right before his Class X exam, Priyadarshi paused all his freelance work.
Discovering his love for literature
Priyadarshi was bad at Chemistry; hence, he couldn't opt for the Science stream in class XI. He went for the next best option — Commerce. But he couldn’t do well in accounting. He adds, “All these years, I developed a keen love for literature. I liked Shakespearean plays, wrote prose and poetry.”
After class XII, while his father wanted him to pursue Business Studies, Priyadarshi wanted to study literature. He always knew that he could continue with his hobby of coding and working on computers. So, he simply took admission in BA in literature at Seth Anandram Jaipuria College. He says, “It really helps me today with my work as I understand how language, grammar, annotation work and how you interpret a sentence and break it up. What we do today largely deals with Natural Language Processing (NLP) which is the art of making the computer understand the intent of human text.”
Breaking the bond with Kolkata
While in college, Priyadarshi resumed working on freelance projects. Unfortunately, after his first year of college, he lost his mother. He says, “I wasn't used to being in Kolkata without her. So, after college, when I got an opportunity to work in Bengaluru, I took that up. It was a conscious decision to leave my life behind in Kolkata and start afresh.”
After graduation, Priyadarshi first got married and later joined the customer contact centre of HP in Bengaluru. At HP, he was a part of the rescue team, which was one level above the technical support team. When HP acquired EDS in 2008, Priyadarshi worked on transitioning EDS support centres from being business oriented to being technically oriented. He then set up the first analytics practice for the customer contact centre. He coded it from scratch, using PHP.
Searching for the ‘one’ (job) — ThoughtNet, Freshersworld, Sportskeeda
After three years at HP, Priyadarshi was itching to join a software engineering team and that took him to ThoughtNet Technologies. They had a modular learning management system at the core and Priyadarshi’s job was to make the platform better and to scale it. The backend was Drupal and front end was built using Adobe Flex (now Apache Flex). After one year, he left ThoughtNet to work for a large enterprise as a technical architect, only to realise that he didn’t belong there.
When Priyadarshi met Joby Joseph, Founder and CEO of Freshersworld, they were looking to solve the problem of scale. Priyadarshi adds, “The company had a sizeable platform back then. It was primarily on Drupal, something I knew and dealt with.” Freshersworld had a system of mailers going to a huge database of users with no real segregation. Priyadarshi took on the recruiters’ challenge of driving participation in walk-in interviews.
Initially, the mailing system was single-threaded. So, it took almost a week to send an email to the entire population of job seekers. Priyadarshi solved the problem by building a framework using AWS, which had just launched its API-driven transactional mail service, SES.
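The move away from a single-threaded mailer can be illustrated with a minimal sketch. It assumes nothing about the actual Freshersworld framework: `send_email` is a hypothetical stub standing in for a call to a transactional mail API, and the pool size of 8 is an arbitrary choice for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def send_email(address: str) -> str:
    """Hypothetical stand-in for one call to a transactional mail API."""
    # A real sender would make a network request here; this stub only
    # returns a status string so the sketch runs without credentials.
    return f"sent:{address}"


def send_all(addresses: list[str], workers: int = 8) -> list[str]:
    """Send to every address using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input addresses.
        return list(pool.map(send_email, addresses))


recipients = [f"user{i}@example.com" for i in range(20)]
results = send_all(recipients)
```

Because sending mail is dominated by waiting on the network, several sends can be in flight at once, so the total time drops roughly in proportion to the number of workers, up to whatever rate limit the mail service imposes.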
While working at Freshersworld, Priyadarshi met Porush Jain, Founder of Sportskeeda, in 2011. With the Cricket World Cup approaching, Porush was keen to build a chat platform for users (Chatkeeda) which could handle a lot of traffic. Priyadarshi says,
The key issue was that they wanted users to have access to the chat history by simply scrolling up. SQL was ruled out because of scaling issues. So, I chose Redis. It lets you store anything in a sorted set so that the chronology is maintained and it remains memory-based.
Performance in Redis comes from the fact that you're able to possess most of what you'll be accessing in memory and the rest goes to the disk. For Chatkeeda, it worked well, scaled well, didn't crash and full chat history was available to the users on scrolling up and it loaded very fast.
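The sorted-set idea behind scrollable chat history can be sketched in plain Python. This is not Chatkeeda's code, and it does not talk to Redis: the hypothetical `ChatHistory` class only mimics the ordering behaviour of one sorted set, with the message timestamp used as the score. In Redis itself, the comparable commands would be `ZADD` to store a message and `ZREVRANGEBYSCORE` with a limit to fetch the page above the oldest message currently on screen.

```python
import bisect


class ChatHistory:
    """In-memory stand-in for one Redis sorted set holding a chat room."""

    def __init__(self) -> None:
        # Entries are kept sorted by (timestamp, message).
        self._entries: list[tuple[float, str]] = []

    def add(self, timestamp: float, message: str) -> None:
        # Comparable to ZADD: the timestamp acts as the score.
        bisect.insort(self._entries, (timestamp, message))

    def page_before(self, before: float, count: int) -> list[str]:
        """Return up to `count` messages older than `before`, newest first."""
        # (before,) sorts ahead of any (before, message) pair, so the
        # boundary is exclusive: messages stamped exactly `before` are skipped.
        end = bisect.bisect_left(self._entries, (before,))
        start = max(0, end - count)
        return [msg for _, msg in reversed(self._entries[start:end])]


history = ChatHistory()
for t, text in enumerate(["m0", "m1", "m2", "m3", "m4"]):
    history.add(float(t), text)

latest = history.page_before(float("inf"), 2)  # first screenful
older = history.page_before(3.0, 2)            # user scrolls up past m3
```

Each scroll passes the timestamp of the oldest visible message as `before`, so the client never needs to track page numbers or offsets.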
Literature and technology: a match made in heaven
Arjun Pratap, Founder and CEO of Edge Networks, approached Priyadarshi with his idea to build a platform where students could come in and create their profiles (for jobs). Given his experience in this domain, Priyadarshi told Arjun that building the platform while keeping forms in mind might not be a great idea, as people prefer to sign up for a portal where they can get opportunities by simply uploading their resumes (and not filling out a long form). Priyadarshi had just started working on NLP then, and proceeded with the building of an engine that would match the job description and resumes. He explains,
The core engine works to get the right person for the right job. A company like Wipro had a huge problem with filtering (the applicants) when dealing with thousands of resumes for a job description.
When he ran the Edge Networks product (JScore) through three job positions at Wipro, the algorithm achieved 80 percent accuracy. There were two parts to the system:
- An NLP engine would process the job description, scanning its noun phrases and understanding them. It used computational linguistics to understand the formation of a sentence.
- The engine also crawled through websites on the internet to understand the text further. It would then correlate any two connected things.
From a job description, the NLP engine could deduce the skills, abilities, functions and traits that can be ranked in order of preference for a job (based on accepted resumes). The engine evolved with time and after going through the background, projects, skills and chronological career growth of the employees, it now understands the allocation of jobs and also predicts the career graph of an employee/applicant.
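To make the idea of scoring resumes against a job description concrete, here is a deliberately simple sketch. It is not the Edge Networks engine: it swaps in plain word counts and cosine similarity, and models none of the noun-phrase analysis, web crawling or learning from accepted resumes described above. The function names and sample texts are made up for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_resumes(job_description: str, resumes: dict[str, str]) -> list[tuple[str, float]]:
    """Score each resume against the job description, best match first."""
    jd = term_counts(job_description)
    scored = [(name, cosine(jd, term_counts(text))) for name, text in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
jd_text = "python developer with nlp experience"
sample_resumes = {
    "a": "python nlp engineer",
    "b": "sales manager retail",
}
ranking = rank_resumes(jd_text, sample_resumes)
```

A word-count baseline like this misses synonyms and context entirely, which is the kind of gap that linguistic analysis of the sort the article describes is meant to close.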
Tech stack at Edge Networks
The backend is built on PHP with most of the computational linguistics written in Python. Geo-location is included as well. Priyadarshi says,
If I had to go to Wipro, process 5,000 resumes and still score in 200 milliseconds, then I needed to architect it that way.
Today, Wipro's entire people supply chain application runs on top of Edge Networks’ APIs. Priyadarshi used PHP at the backend, Python for data science services, and R (programming language) for analytics.
Priyadarshi believes that the following critical points defined technical success for Edge Networks:
- Moving on from PHP at the backend to Python so that it was cohesive across the stack.
- Taking the microservices approach, which divides the app and its different parts into services. This helped the company scale, because a lot of what was deployed needed heavy customisation. Debugging and training also became easier with this approach.
A step ahead from others
Priyadarshi believes that even though there are many people who're building solutions for talent acquisition, his goal is to achieve internal employee optimisation. He further believes that the following gives him (and Edge Networks) an advantage over any other company/product:
1. Edge Graph - It lets the company look at the current and future market to understand the entire skill landscape. The NLP engine and the Edge Graph work in tandem to drive the understanding of JDs and profiles.
2. Dealing with enterprise data gives the company a lot of insights which resumes usually don't.
3. The modern architecture adding to the capabilities - Use of serverless functions, handling scale and ability to make changes quickly are all possible because of the company’s modern architecture. Priyadarshi adds,
We kept security and enterprise readiness on our priority list while building the architecture, because we deal with very sensitive data. Even the data which goes on NLP engine is anonymised before. All the deployments are based on client's companies. We run more than 600 penetration tests on our systems every month.
4. Deep Learning - Priyadarshi decided to move on from regular machine learning to deep learning almost two years ago. This enables Edge Networks to run smaller databases through a larger number of iterations and get maximum learning out of them in a much shorter time.
How to hire right
Priyadarshi joined Edge Networks as Principal Engineer, became head of engineering later, and is now its CTO. He continues to look at larger technology decisions, making sure that there are checks and balances at the right places. Through the years, he has built a robust tech team as well. While hiring, he looks for the following qualities in a techie:
- The ability to learn - The engineers working at the company come up with a real-world problem to solve. The candidate works on its solution which then goes through a peer review of how well the solution is crafted.
- Knowledge of what's happening in his/her technology landscape today.
- If he/she is a good fit for work at a startup.
Issues at hand and the future
While the mission of Edge Networks is to get the right people for the right job, Priyadarshi would like the system to evolve to guide and mentor the individual (job seeker) to search for better opportunities and learnings.
And as far as his love for literature is concerned, it has now been replaced by technical literature mostly. Movies have taken the place of books. During his college days, the Head of Department of English told Priyadarshi something which he still remembers today — "If you've to truly appreciate a work of fiction, there has to be a willing suspension of disbelief."