Deploy an Auto-Scaling HPC Cluster with Slurm
Description
Chances are that you have already used one of the many Google services (e.g., Search, Gmail, Chrome, YouTube) that each have over a billion users. Google Cloud is built on the same robust global infrastructure that securely and reliably delivers these services to the world. We offer a complete platform addressing the compute, storage, networking, security, and AI/ML needs of a successful cloud-enhanced EDA strategy.

One of the basic services of a cloud is infrastructure as a service (IaaS). With IaaS, customers customize the infrastructure they need, acquire it almost instantly, and discard it when it is no longer needed. This gives customers best-in-class compute when they need it, with virtually no limits. Google Cloud is one of the prominent cloud providers, with the industry’s best IaaS offerings (described more in ‘Compute’).

Cloud also offers several types of storage. It is a common misconception that storage on cloud is expensive. On the contrary, cloud offers several tiers of storage, and selecting the right tier for each task can make the total cost of storage very reasonable for most data center needs. Google Cloud offers low-latency storage solutions that can meet the requirements of typical chip design customers.

AI and ML are now used across every industry, and every company is looking to leverage its data to increase its impact. Google Cloud’s Data Management and AI/ML solutions are the most comprehensive in the industry, with out-of-the-box support for several types of data analytics and ML models to help customers build data-driven solutions quickly.
By the end of the course you will be able to:
- Creating Virtual Machines
  - Create several standard VMs
  - Create advanced VMs
- Cloud Filestore: Qwik Start
  - Create a Cloud Filestore instance
  - Mount the fileshare from that instance on a client VM instance
  - Create a file on the mounted fileshare
- VPC Networking Fundamentals
  - Explore the default VPC network
  - Create an auto mode network with firewall rules
  - Create VM instances using Compute Engine
  - Explore the connectivity for VM instances
- Deploy an Auto-Scaling HPC Cluster with Slurm
  - How to set up a Slurm cluster using Terraform
  - How to run a job using Slurm
  - How to query cluster information and monitor running jobs in Slurm
  - How to autoscale nodes to accommodate specific job parameters and requirements
  - Where to find help with Slurm
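To give a flavor of the Slurm topics listed above, here is a minimal Python sketch that generates the text of a Slurm batch script. It is an illustration only and not taken from the course material: the helper name `build_sbatch_script` and the job values are hypothetical, while the `#SBATCH` directives (`--job-name`, `--nodes`, `--ntasks-per-node`, `--time`, `--output`) are standard `sbatch` options.

```python
def build_sbatch_script(job_name, nodes, tasks_per_node, time_limit, command):
    """Return the text of a Slurm batch script (hypothetical helper, for illustration)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",                      # number of compute nodes requested
        f"#SBATCH --ntasks-per-node={tasks_per_node}",   # tasks launched on each node
        f"#SBATCH --time={time_limit}",                  # wall-clock limit, HH:MM:SS
        "#SBATCH --output=%x-%j.out",                    # %x = job name, %j = job ID
        "",
        f"srun {command}",                               # run the command on every allocated task
    ]
    return "\n".join(lines) + "\n"


# Example values are assumptions chosen for illustration.
script = build_sbatch_script("hello", 2, 1, "00:05:00", "hostname")
print(script)
```

Saved to a file, a script like this would typically be submitted with `sbatch`, and the queue and node states inspected with `squeue` and `sinfo`. On an auto-scaling cluster, the number of nodes a job requests is the kind of job parameter that could prompt additional compute nodes to be started.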
Who this course is for:
- University students/professionals familiar with EDA workflows using LSF/UGE/Slurm/HPC who want to learn cloud deployment