Microsoft Azure

Learn how to build a fast, scalable data system on Azure Hyperscale (Citus) and Cosmos DB

Microsoft Azure technical specialists Brian McKerr and Srikant Sridhar discuss how fully managed data services help high-performance applications stay agile, scalable, and responsive in Microsoft’s latest webinar on hyperscale databases.

Friday, April 1, 2022

Consumers expect organisations to be responsive, agile, and data-driven, so businesses need their applications deployed in data centres close to their users. Those applications must respond in real time, store exponentially growing volumes of data, and present it to users within a fraction of a second.

As data volumes and consumption requirements grow, the database server must scale in both compute and storage. Fully managed data services can serve these workloads with millisecond response times and are designed to deliver scalability and speed.

To explore the potential and scalability of advanced open-source databases, Microsoft and YourStory jointly hosted a webinar titled ‘Build high-performance apps with limitless scale on hyperscale DB’ with panelists Brian McKerr, Technical Specialist for open-source databases, Microsoft Azure, and Srikant Sridhar, Senior Specialist, Azure Cosmos DB.

Advance app performance with Postgres on Azure

“We have three deployment options for Postgres on Azure: a single-node database with advanced security and availability, a flexible server (a single-node Postgres DB) offering enterprise features and performance, and Hyperscale (Citus), a scale-out distributed Postgres DB that can scale to thousands of nodes and petabytes of disk storage,” said Brian.

Discussing the key features of Citus, Brian listed managing large infrastructure with security, backups, high availability, disaster recovery, integration, scale, and performance.

In a live demo, he showed how high-performance, high-scale apps on Citus can provision capacity and resources, with connection pooling, to store terabytes to petabytes of data. He also covered the Azure data ecosystem, including Azure Data Factory, Azure Functions, and Azure App Service, which supports easy deployment and scripted automation using SQL queries.

“These solutions are offerings where the entire database is distributed and, depending on your application, we give you the flexibility to choose an appropriate model. We can rebalance data live and non-disruptively, moving shards from existing nodes to new nodes so that the data is spread uniformly. Also, the tenant isolation feature lets you dedicate ring-fenced resources to specific tenants,” added Brian.
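The shard rebalancing Brian describes can be illustrated with a small, self-contained Python model. This is only a conceptual sketch, not how Citus is implemented: the shard count, the hash function, the node names, and the helper names `shard_for`, `place_shards`, and `rebalance` are all assumptions made for illustration.

```python
import hashlib

SHARD_COUNT = 8  # hypothetical value; a real cluster chooses its own shard count


def shard_for(tenant_id: str) -> int:
    """Map a distribution-column value to a shard using a stable hash."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SHARD_COUNT


def place_shards(nodes: list[str]) -> dict[int, str]:
    """Initial placement: hand shards out to nodes round-robin."""
    return {shard: nodes[shard % len(nodes)] for shard in range(SHARD_COUNT)}


def rebalance(placement: dict[int, str], nodes: list[str]) -> dict[int, str]:
    """Move one shard at a time from the fullest node to the emptiest node
    until node shard counts differ by at most one.

    Every node named in `placement` must also appear in `nodes`.
    """
    by_node: dict[str, list[int]] = {node: [] for node in nodes}
    for shard, node in sorted(placement.items()):
        by_node[node].append(shard)

    while True:
        busiest = max(nodes, key=lambda n: len(by_node[n]))
        idlest = min(nodes, key=lambda n: len(by_node[n]))
        if len(by_node[busiest]) - len(by_node[idlest]) <= 1:
            break
        by_node[idlest].append(by_node[busiest].pop())

    return {shard: node for node, shards in by_node.items() for shard in shards}
```

In this model, rows never change shard: `shard_for` depends only on the tenant ID. Adding a node changes only which node hosts each shard, which is why data can be spread onto new nodes without the application changing how it addresses its data.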

Hyperscale (Citus) on Azure Database

According to Brian, Hyperscale (Citus) is a built-in deployment option in the Azure Database for PostgreSQL managed service that lets teams focus on application performance rather than spend time managing databases.

“One feature of Citus 10 for analytics workloads is columnar compression, which stores table data by column. Another feature is read replicas: asynchronous replication of your database so that analytic workloads can be offloaded to a copy of the data,” he added.
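To see why storing data by column helps analytics, consider the following minimal Python sketch. It pivots rows into columns and compresses each column with run-length encoding. Run-length encoding is used here only as a simple stand-in; it is not the storage format or compression codec that Citus columnar uses, and the function names are hypothetical.

```python
from itertools import groupby


def to_columns(rows: list[dict]) -> dict[str, list]:
    """Pivot row-oriented records into one list of values per column.

    Assumes every row has the same keys.
    """
    columns: dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values: list) -> list[tuple]:
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def column_sum(encoded: list[tuple]) -> int:
    """Aggregate a numeric column directly from its encoded form."""
    return sum(value * count for value, count in encoded)
```

Values within a single column tend to repeat, so they compress well, and an aggregate such as a sum only needs to read the one column it touches rather than every field of every row.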

Among other features, Brian highlighted a managed PgBouncer for handling large numbers of connections, configurable custom maintenance windows that give full control over when platform maintenance is performed, easier shard key commands, and MX (distributed metadata with linear performance improvements), which lets applications and users connect to any node in a cluster rather than going through the coordinator.
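The idea behind connection pooling can be sketched in a few lines of Python. This is a conceptual illustration only, not PgBouncer itself or its configuration: the `ConnectionPool` class and the `connect` factory passed to it are hypothetical names, and a real pooler sits between the application and the database server.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Share a fixed number of connections among many short-lived requests."""

    def __init__(self, connect, size: int):
        # `connect` is any zero-argument callable that opens one connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection and return it to the pool when done.

        Raises queue.Empty if no connection becomes free within `timeout`.
        """
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)
```

Because requests borrow and return connections instead of opening their own, the database sees a small, bounded number of connections however many clients the application serves.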

He also walked through use cases and projects with demanding scalability requirements, which showed simpler system architecture, lower skill barriers, and results that met business expectations.

How to select the right API in Azure Cosmos DB

“Cosmos DB is the fastest NoSQL database available in the market. It is a multi-model NoSQL database and supports several open APIs for your workload needs. Cosmos DB can be used as a document database, a columnar database, a graph database, or a key-value database as per requirements,” noted Srikant.

Cosmos DB offers latency of less than 10 milliseconds and limitless storage scale for building high-performing, scalable applications, with availability across many regions. For application developers and the technical community, Cosmos DB SDKs support major programming languages such as .NET, Java, Python, PHP, and Node.js.

“This is highly scalable to handle big data workloads. With the high availability and geo-replication feature, you can replicate data to any part of the world seamlessly, attract users from different geographies, and have a turnkey global replication capability,” he added.

According to him, Cosmos DB has extensive applications across several industries, such as e-commerce, supply chain, finance, retail, telecommunications, and gaming.

Key differentiators such as flexibility, scalability, low-cost real-time analytics on data stored in Cosmos DB, predictive maintenance, consistency, and availability make Cosmos DB an effective product for building any scalable application.