Everybody has heard how Big Data is going to transform every industry, but businesses remain hesitant to invest the time and money needed to hire an in-house data science team. DEXTRA is demonstrating how businesses can harness data science without having to put data scientists on the payroll! The key lies in crowdsourcing.
DEXTRA is an online data innovation challenge platform that gives companies access to a community of data scientists. DEXTRA was created through a collaboration between the Infocomm Development Authority of Singapore (IDA) and Newton Circus (a Singapore-based technology startup). Companies host data-driven challenges on DEXTRA and open up their data sets to the online community of data scientists. The data scientists compete to find the best solutions to these challenges, and the winners get a cash prize from the host company. Data scientists are driven by the opportunity to get access to different data sets and by knowing that their algorithm could end up being incorporated into the host company’s business.
If your company would like to test the powers of Big Data by starting small, read on and find out how it all works. Joelle Pang (Project Director for DEXTRA) tells us more about the journey so far and what lies ahead for this exciting new startup!
YS: How did DEXTRA come into being?
Joelle: Newton Circus organizes a hackathon platform called UP Singapore, which allows developers and subject matter experts to come together over a weekend hackathon to create solutions for complex urban problems, using data sets from government agencies and private sector organizations. We realized that the data scientists in the community were really interested in the data sets that were being liberated from public and private sector organizations. However, after each hackathon, there was no longevity to the data sets released, and the data partnerships made over the hackathon would end with the weekend.
Initially, the idea was to have an online Data Exchange where the data sets could be accessed by researchers, institutions and enterprises 24/7, so that they could use them to enhance their research or just have fun with the data sets. From that idea, it evolved into a challenge platform. We thought: “If we have 1000+ data scientists who are closely engaged with us, what's the best way to harness the full value of this community of data scientists?”
All companies know that they need to use data to make the right business decisions and get ahead of their competitors. We noticed that there is a huge crunch in data science talent, especially in a small place like Singapore. Companies typically would like to engage data scientists, but don’t have the budget to hire them full-time. Data scientists are hard to come by and most of them happen to hold full-time jobs with research institutions or companies. So, we set out to bridge this gap by utilising the cognitive surplus of the data science community to serve the needs of companies – at a fraction of the cost.
YS: How do the companies come up with the challenge statements?
Joelle: The inspiration for the challenge typically comes from the challenge host, because they are the ones with the most intimate knowledge of their business and the challenges they are facing – whether it is sales forecasting, R&D or operational efficiency. DEXTRA provides consultation to companies to bridge the gap between the management and data scientists. Data scientists typically encounter senior management who want solutions immediately, or in time frames that may not be realistic. The challenge statements also need to be crafted appropriately, so that the data scientists have all the information they need to create accurate algorithms. To ensure the quality of the challenge statements, we engage a closed group of experienced data scientists to review challenges before they go live.
YS: How has the experience been so far?
Joelle: The Data Innovation Challenge initiative is a collaboration between Newton Circus and the IDA. We are very thankful to have the support of such a prominent public sector organization, because it has helped open a lot of doors for us, especially with other public sector agencies. They are more willing to help us, since we are backed and endorsed by the IDA.
Singapore is a great gateway and test-bed for startups, since many people come here to explore opportunities. The government is making a big push for data analytics and more liberal sharing of data. So, it’s good to know that DEXTRA is aligned with the government's vision of data science in the coming years. As a macro-economic factor, it's on our side for sure.
As with every start-up, the beginning was not easy – we are exploring different business models, clients and new ways of engaging our community. But the progress so far has been very encouraging.
We have built a community of over 700 data scientists in just 6 short months. We also recently concluded our 1st major challenge with DSM Engineering Plastics. DSM is the world’s 3rd largest engineering plastics producer, with revenue of over 10 billion Euros per year. They hosted a Sales Forecasting Challenge with DEXTRA for their engineering plastics business.
YS: What was the experience like working with an MNC like DSM?
Joelle: It was great working with the DSM team, because even though they work in a large company, they have a very entrepreneurial spirit and are willing to take a chance on something so cutting edge. We have been working with them very closely for the last few months. At first, there was a lot of red tape, but we managed to work through it and successfully conclude the challenge. We are currently rolling out the next stage.
The 3 winning teams will work with DSM to come up with models and algorithms that can be incorporated into DSM's business. Beyond just implementing the current solution, DSM is looking to see how this way of crowdsourcing data science solutions can go global within the company.
It is a good success story, and it has opened up more opportunities to get more companies excited about exploring a new way to achieve their goals. Crowdsourcing is a trend that has been gaining popularity over the last 1-2 years. It's not something untested: a lot of success stories have already come out of it. This challenge with DSM has helped to show that it’s possible for large companies to use crowdsourcing to achieve their results.
YS: Was DSM pleased with the winners’ solutions? How cost effective is it for companies to run challenges as opposed to hiring their own data scientists?
Joelle: DSM was very pleased with the solutions from the community. They felt that even before implementing the solutions into their business, they had the chance to learn about the high impact variables which affect their sales forecasts. During the challenge, all the data sets were anonymized to protect DSM’s trade secrets, but after seeing the algorithms, the DSM team has decided to release non-anonymized data sets to the 3 winning teams and work closely with them to help them create more accurate forecasting models.
The value for the challenge host is very clear. One of the winning teams was a team of 6 from one of Singapore’s premier research institutes, A*STAR. The team told us that they spent about 2 man-weeks creating the solution for the challenge. Hiring data scientists of that calibre for 2 weeks would easily cost companies between SG$20,000 and SG$30,000.
YS: How is DEXTRA different to Kaggle?
Joelle: The main value proposition is very different. Since we are Asia-based, we can be flexible enough to have both offline and online challenges. We want our clients to be able to have physical interaction with our community of data scientists. The shortlisted teams from the DSM challenge got the chance to meet the senior leaders of DSM and to share their ideas with them. If everything were based online, you'd never know if people were cheating. Through physical interaction, you can learn more about the calibre of each participant.
We are also quite selective about our challenge hosts, because each challenge host gets our commitment and time to come up with the best challenges. There's also a closed group of data scientists we work with to vet each challenge before it goes live. It's a quality assurance we offer to each of our challenge hosts that sets us apart from Kaggle.
YS: What’s the big audacious goal for DEXTRA?
Joelle: We are at the stage where we are looking for collaborative partners who share our passion for the liberation of data, as well as data science in general. Over the next few months, we are looking forward to hosting a few more big challenges.
We want companies big and small to come to DEXTRA to explore data science as a way to take their business to the next level. We want it to be an inclusive platform for NGOs, SMEs and startups who may not have that much data, but who can see DEXTRA as their go-to platform if and when they reach the stage where they need to consider data analytics more seriously.
Another goal: As Big Data becomes more mainstream, we want to use our community to educate people outside the community about big data and help them understand how it can be used for businesses as well as social good. One of our community’s researchers was telling us about how he wanted to come up with exposure modelling for outdoor pollutants, so that he can help the government time their citizen messages during Singapore’s annual Haze crisis. We are very excited to see more such solutions for social good coming from DEXTRA’s community as well.
YS: Can our readers based in India participate or host challenges on DEXTRA?
Joelle: We welcome participants from all over the world. That’s the good part about having an online challenge platform. We are open to working with challenge hosts outside of Singapore as well. If companies overseas want to work with us, we are probably at the level of readiness where we just need to send them the next steps and take it forward. As for participating data scientists, we had a team from Vietnam for the DSM challenge. They even got to present to the managers via conference calls, and everything went smoothly. It gave us a glimpse of what the future holds for us and it’s very exciting!
YS: What are some of the challenges you are facing?
Joelle: When you're working with a cutting-edge concept, it is exciting and interesting, but people take some time to get convinced and put their money where their mouth is. Getting buy-in from senior management is another hurdle we constantly face. Once a challenge is successful, everybody agrees that it was the right thing to do, so it comes down to us to find ways to convince top management that crowdsourcing data science talent is the most cost-effective way of harnessing the power of data science. We are the pioneers in this space in Asia, so we know that this is a journey we have to undertake.
Want to know how your business can start small with Big Data? Visit www.dextra.sg for more information!