Razorpay is India’s first converged payments solution company. It aims to revolutionize online payments by providing clean, developer-friendly APIs and hassle-free integration. Founded in 2014 by IIT-Roorkee alumni Harshil Mathur and Shashank Kumar, the company provides businesses with comprehensive and innovative solutions built on robust technology to address the entire length and breadth of their payment journey.
The company’s merchant count currently stands at over 1 lakh and is geared to increase to 2 lakh by the end of 2018. Their vision is to impact 500 million businesses by 2020.
Problem Statement: Razorpay was looking for a better log analytics platform that would give them real-time data insights and help them be more proactive in both monitoring and troubleshooting modern applications.
In the early days, Razorpay was keen to showcase its products to the market very quickly. They picked a log analytics product that was easy to use and allowed Razorpay to slice and dice data in many ways. Unfortunately, it was too expensive, and the founders realized that, at the pace at which the company was growing, it would not be a sustainable option.
What they had in mind was a solution that was agile, flexible, infinitely scalable, cost-effective and easy to adopt. They also wanted to get to the expected benefits reasonably quickly.
Moreover, Razorpay was growing rapidly and had to consider the semantics of the data to ensure that whatever they built today would work well for the next few years. When a problem occurred in the system, it was important to identify the day, country and region in which it occurred in order to fix it quickly. Detecting the IP address from which the error originated would help them drill down to the root of the problem.
In other words, Razorpay wanted a solution that would easily digest, process and improve data as and when required. Their earlier experiences had taught them that migrating critical data quickly was the need of the hour.
Two years ago, Razorpay used a combination of their own ELK solution and a third-party log management solution that covered their security and compliance requirements, digested their logs and provided multiple dashboards. Although effective, it was quite expensive and not as efficient as they needed. Just when they were looking for a partner who could ride along with them and foresee the future, Sumo Logic came along.
Sumo Logic is a secure, cloud-native, machine data analytics platform, delivering real-time, continuous intelligence from structured, semi-structured and unstructured data across the entire application lifecycle and stack.
Sumo Logic met the yardstick Razorpay had in mind: its solution was cost-effective and did not compromise on two critical requirements. Issue detection and query response were much quicker, and Razorpay did not have to dedicate too many resources to solving a problem. With Razorpay’s earlier solution, they needed bandwidth, had to manage the platform themselves, and incurred overheads in migrating 10-15 GB in a single day. Sumo Logic, on the other hand, provides IP Intelligence, IP Detection and IP to Lat-long Translation as value-added features that are built into the platform and are not charged for separately.
Sumo Logic simplifies how you collect and analyze machine data to get the insights needed to drive the best customer experiences. With the Sumo Logic platform, users can accelerate modern application delivery, monitor and troubleshoot in real time and improve their security and compliance posture.
Multiple dashboards: It drills through the data and provides various dashboards, some of which serve compliance and the rest forensics. Dashboards are maintained for each application and cover error rates, error codes, infrastructure logins, CPU, memory, iostats and more, and they can be accessed by different individuals or teams.
Problem identifier: Navigating through multiple servers, logs, and stacks is a daunting task for any individual. Because Razorpay operates in a regulated space, most individuals don't even have access to the system. Sumo Logic gives a reasonably consistent picture of what went wrong by collecting logs, identifying patterns, studying the problem and boiling it down to the exact cause.
Threat intelligence: Sumo Logic can correlate log data with known indicators of compromise (IOCs) from threat intelligence feeds to identify and visualize malicious IP addresses, domain names, email addresses, URLs and MD5 hashes, and alert Razorpay the moment there is an intrusion in the system.
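To make the idea of matching log data against known IOCs concrete, here is a minimal sketch in plain Python. It is not Sumo Logic's API or query language; the indicator sets, the sample log lines and the function name `find_ioc_matches` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical indicators of compromise; a real threat feed would supply these.
KNOWN_BAD_IPS = {"203.0.113.7", "198.51.100.23"}
KNOWN_BAD_DOMAINS = {"malicious.example"}

# Simple patterns for IPv4 addresses and domain names inside a log line.
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DOMAIN_PATTERN = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def find_ioc_matches(log_lines):
    """Return (line_number, indicator) pairs for lines that mention a known IOC."""
    matches = []
    for number, line in enumerate(log_lines, start=1):
        for ip in IP_PATTERN.findall(line):
            if ip in KNOWN_BAD_IPS:
                matches.append((number, ip))
        for domain in DOMAIN_PATTERN.findall(line):
            if domain.lower() in KNOWN_BAD_DOMAINS:
                matches.append((number, domain.lower()))
    return matches


# Made-up log lines, used only to exercise the function.
SAMPLE_LOGS = [
    "2018-05-02T10:15:00Z GET /v1/payments from 192.0.2.10 status=200",
    "2018-05-02T10:15:03Z GET /v1/payments from 203.0.113.7 status=401",
    "2018-05-02T10:15:09Z callback to http://malicious.example/hook status=500",
]

ALERTS = find_ioc_matches(SAMPLE_LOGS)
```

In this sketch, `ALERTS` holds one entry per suspicious line, which is the kind of signal an alerting rule could act on.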
Geo-intelligence: An important feature is that it slices and dices the data across various geographical locations for better analysis.
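As a rough illustration of slicing error data by geography and day, the sketch below counts server errors per (day, country) pair. It assumes the events have already been enriched with a location; the field names, sample events and the function `errors_by_day_and_country` are hypothetical and do not reflect how the platform stores or queries data.

```python
from collections import Counter

# Hypothetical, already geo-enriched events; all field names are assumptions.
EVENTS = [
    {"timestamp": "2018-05-02T10:15:03Z", "country": "IN", "region": "KA", "status": 500},
    {"timestamp": "2018-05-02T11:40:12Z", "country": "IN", "region": "MH", "status": 502},
    {"timestamp": "2018-05-02T12:01:44Z", "country": "SG", "region": "SG", "status": 200},
    {"timestamp": "2018-05-03T09:05:30Z", "country": "IN", "region": "KA", "status": 504},
]


def errors_by_day_and_country(events):
    """Count events with a 5xx status, keyed by (day, country)."""
    counts = Counter()
    for event in events:
        if event["status"] >= 500:
            day = event["timestamp"][:10]  # the YYYY-MM-DD prefix of the timestamp
            counts[(day, event["country"])] += 1
    return counts


ERROR_COUNTS = errors_by_day_and_country(EVENTS)
```

Grouping on `region` instead of `country` would give the finer-grained view in the same way.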
Multi-dimensional users: It serves users spread across different teams, including developers, IT operations and DevSecOps, when they analyze data for a particular application.
Commenting on how Razorpay benefitted from partnering with Sumo Logic, Raju Shetty, Razorpay's VP of Engineering, says,
"Transitioning to the Sumo Logic platform from an on-premise solution was a breeze. We've been using Sumo Logic's solution for a few months now and the platform continues to proactively help us address issues in our applications. It provides complete visibility to our devops and analyst teams, assisting in alerting and monitoring our critical business serving applications."
Razorpay started the migration towards the end of April 2018 and, to begin with, routed roughly 40-50 GB/day of IP intelligence and other critical datasets. At present, a lot of compliance-related logging is taking place. Dashboards and application insights have been built to serve their day-to-day requirements. With Sumo Logic's support, retaining old datasets and dashboards has been a seamless process that took two weeks, but ingesting the multi-terabyte historical data into the platform will take some time.
If an issue occurs in the system, Sumo Logic's solution analyzes various systems, monitors each layer in one shot, and helps the team connect the dots and find the crux of the problem. IP addresses that pose a threat are also marked and filtered, reducing the chances of the error being repeated. This translates to quicker detection and response times, and Razorpay was able to increase their productivity by 20 percent.
Currently, Razorpay has around 18 applications, infrastructure components, web servers and databases that need to be migrated, and they're focusing on migrating the critical ones immediately. For most organizations, planning, execution, and migration take a month or two. For a startup, this is a reasonable timeline given the multiple applications that need to be tested.
Booking a ticket on IRCTC, paying online for food delivery, settling your electricity bill: for a consumer, the ease with which they can perform these transactions means that they don’t give a second thought to the process that enables them. But in the backend, a single transaction passes through several applications. If latency or a timeout occurred in the life cycle of a transaction, and if a logging system didn't exist, the developer at Razorpay would be clueless as to what went wrong. With Sumo Logic, he or she can access a central logging system, retrieve the transaction details and, based on the logs and the transaction identifier, analyze which part of the infrastructure did not perform or respond. In other words, troubleshooting becomes much easier.
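The troubleshooting flow described above can be sketched in a few lines of Python: filter a central log store by transaction identifier and flag the hops that failed or were slow. This is only an illustration under stated assumptions. The entries, field names, service names, threshold and the function `find_failing_hops` are hypothetical and are not taken from Razorpay's systems or from Sumo Logic's API.

```python
# Hypothetical centralized log entries; one entry per service hop of a transaction.
LOG_ENTRIES = [
    {"txn_id": "txn_001", "service": "api-gateway", "duration_ms": 12, "outcome": "ok"},
    {"txn_id": "txn_002", "service": "api-gateway", "duration_ms": 10, "outcome": "ok"},
    {"txn_id": "txn_001", "service": "risk-check", "duration_ms": 45, "outcome": "ok"},
    {"txn_id": "txn_001", "service": "bank-connector", "duration_ms": 30000, "outcome": "timeout"},
    {"txn_id": "txn_002", "service": "risk-check", "duration_ms": 40, "outcome": "ok"},
]


def find_failing_hops(entries, txn_id, slow_threshold_ms=1000):
    """Return the services that failed or exceeded the threshold for one transaction."""
    return [
        entry["service"]
        for entry in entries
        if entry["txn_id"] == txn_id
        and (entry["outcome"] != "ok" or entry["duration_ms"] > slow_threshold_ms)
    ]


# The assumed 1000 ms threshold is arbitrary; a real service would tune it per hop.
FAILING = find_failing_hops(LOG_ENTRIES, "txn_001")
HEALTHY = find_failing_hops(LOG_ENTRIES, "txn_002")
```

Here the lookup for `txn_001` points at the `bank-connector` hop, while `txn_002` comes back clean, which mirrors how a shared identifier lets a developer narrow a problem to one part of the infrastructure.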
For Razorpay, it's too early to quantify metrics, as IP intelligence still needs to seep in. So far, the journey has been quite smooth. They were able to get to this stage fairly quickly and look forward to phases 2 and 3, in which further dashboard elements will be enabled.
With Sumo Logic, Razorpay has improved processes and systems and increased business value.