Project name:

DSW: Cloud platform for collaborative R&D work by Data Science teams

Project description:

Service summary: Data Science Workspace (DSW) is a cloud platform for Data Science teams and individuals with the following capabilities:

  • Change computing performance and libraries in one click, without leaving Jupyter Lab/Notebook
  • Monitor sessions and resource usage
  • Save results to a project network folder with access control

It covers the R&D stage and supports a pay-as-you-go approach. In the near future, DSW is planned to support distributed computing (Spark) and a container builder for custom sets of ML libraries.

Benefits for companies:

  • Increase Data Science productivity with a ready-to-use workspace
  • Reduce data losses and leaks by saving all data marts and research results in a network folder with access control
  • Use expensive high-end servers on demand
  • Monitor team performance in terms of resource usage

Target audience:

  • Companies that have Data Scientists and data for research.
  • Students at universities and in Data Science training courses.
  • Individual Data Scientists who want to start scalable research and avoid unnecessary DevOps.

Market: The global cloud analytics market is $8.3B, with a growing GPU-as-a-Service approach for Deep Learning.

Technologies and approaches: A web portal integrated with Jupyter Lab / Notebook on the front end. Many technologies under the hood, including VMware, Kubernetes, Docker, JupyterHub, GitLab, and Anaconda Distribution.

DSW Monitoring interface example

Readiness level: A technical release has been produced and sales have started.

Description: We saw global demand grow for developing prediction models with machine learning. Manufacturing and insurance companies joined telecom and banking.

Since Machine Learning (ML) is a new way of programming complicated logic, companies need tools, servers, and qualified DevOps to support ML engineers. Another challenge was monitoring computing resources and using them efficiently: there was no simple tool for that. Therefore, we decided to automate most of these processes in order to speed up research.

Development phases: We built this service from the ground up. After the market analysis, once the basic investment budget was calculated and approved, I conducted about 16 in-depth customer interviews with data engineers and their managers, university teachers, and hackathon managers in order to understand their needs and find frequent requests. As a result, I created a product backlog of about 27 product features grouped by problem area.

Within a month, we developed a demo version for an internal presentation. The following month, we developed an MVP for an AI conference.

The technical release took about 4 months due to technological complexity. Considering that we worked in a large, slow-moving system integration company, our team moved fast.

In parallel with the technical release, I built the support SLA process and developed a sales kit with pricing, benefits, and a comparison with competitors. Then I started sales enablement and the first presales.

Product challenges: We had to improve open-source software in order to close security vulnerabilities.

If a company's data is not in the cloud, the company wants to move it there first, which requires prerequisites such as S3-compatible storage, basic self-service automation, and billing. The cloud provider should automate basic data services first.

DSW has almost completed the Product Discovery stage, from idea to MVP and first sales. During customer research, I discovered another segment of companies that asked for ready-to-use, industry- and case-specific machine learning models. This is how the MiGA product appeared.