apex-goliath - nstda

Post on 30-Dec-2021


Apex-Goliath: National AI & Data Science Research Infrastructure

What is Apex-Goliath?

Apex-Goliath is a federated AI experimental research and development infrastructure for Thai universities, researchers, students, and startups. It will provide services related to AI research including data storage and computer clusters for AI, High-Performance Computing (HPC), and Cloud services.

Apex-Goliath Services

● Federated computing platform with regional nodes
    ○ For university, student, and startup research projects
● Experimental hybrid AI/HPC/Cloud infrastructure
    ○ AI model experiments and training
    ○ Data lake infrastructure and sharing platform
    ○ Inference edges & services
● Commodity + web-scale architecture meets HPC
● Resource sharing + pilot services

Universities / Research Centers / Startups

Interactive/Batch Experiments & Regional Data Collection on federated AI R&D infrastructure

Related Platforms

● Apex: AI-Focused Tasks
    ○ AI, Deep Learning, Inference, ASR, Video/Image processing
    ○ For universities, research centers, consortiums, startups
● ThaiSC: HPC-Focused Tasks
    ○ Simulation, Genomes, Aerospace, Military
    ○ For national research institutes & government research
● GDCC (MoDE)
    ○ Government data, applications, & services
    ○ For government services & public organizations

● Scale Out: roll out successful pilots to GDCC or private cloud
● Scale Up: batch high-compute tasks to ThaiSC
● Mixed: mixed data & compute-intensive workloads

Target University & Industry Research Domains

● AI for Health & Medical Security
● AI for Food & Agriculture
● AI for Local Tourism & Economy
● AI for Entertainment & Creative Technology
● AI for Finance / Logistics / Businesses
● Etc.

Apex & Goliath Platform Structure

● Apex: AI Research Platform & Data Exchange Infrastructure
    ○ Hybrid AI/HPC/Cloud infrastructure
● Goliath: AI & Data Analytics Platform
    ○ Data journalism / sharable research datasets / pre-trained models & notebooks / infrastructure access
    ○ Common Schema for Research Datasets
● Applications
    ○ Disease control, Social distancing
    ○ Travel, Retail, & Tourism
    ○ Food Safety
    ○ Etc.
● Data
    ○ Precision Agriculture/Aquaculture
    ○ IoT & Smart Buildings/Energy
    ○ Health & Wearables
    ○ GIS/Logistics/Retail Map
    ○ Common Thai-context Corpus (Tagged Speech/Image/Texts)

Apex AI Research Infrastructure

● R&D Infrastructure
    ○ Hardware Infrastructure: High-Performance Computing & Storage System Prototype
    ○ 1,536 logical CPU cores
    ○ 30 petaFLOPS AI (48 x A100 GPUs)
    ○ 3 petabytes storage
    ○ 200 Gbps HDR interconnect
● Funding Agency: PMU-C (BCG Digital) / NXPO / MHESI
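The "30 petaFLOPS AI" figure for 48 A100 GPUs can be sanity-checked with simple arithmetic. The slide does not say which precision or mode the number refers to, so the sketch below assumes NVIDIA's published peak FP16/BF16 Tensor Core throughput for a single A100 (312 TFLOPS dense, 624 TFLOPS with structured sparsity). Under that assumption, the quoted figure appears to match the sparse peak rather than the dense one.

```python
# Assumed per-GPU peak figures (FP16/BF16 Tensor Core) from NVIDIA's
# published A100 specifications; the slide itself does not state them.
A100_TFLOPS_DENSE = 312.0
A100_TFLOPS_SPARSE = 624.0  # with structured sparsity enabled

NUM_GPUS = 48  # from the slide: "48 x A100"


def aggregate_petaflops(num_gpus: int, tflops_per_gpu: float) -> float:
    """Aggregate peak throughput in petaFLOPS (1 PFLOPS = 1000 TFLOPS)."""
    return num_gpus * tflops_per_gpu / 1000.0


dense = aggregate_petaflops(NUM_GPUS, A100_TFLOPS_DENSE)
sparse = aggregate_petaflops(NUM_GPUS, A100_TFLOPS_SPARSE)

print(f"dense peak:  {dense:.2f} PFLOPS")
print(f"sparse peak: {sparse:.2f} PFLOPS")
```

These are theoretical peaks; sustained throughput on real training workloads is typically lower.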

Apex System Architecture

Goliath Data Exchange Platform

Goliath allows AI researchers and engineers to share datasets with other researchers, helping them find the datasets they need as well as promoting the use of previously siloed, underutilized—but valuable—datasets.

Powered by Apex, Goliath also lets users tap into Apex's high-performance computing power for their AI model training needs.

Goliath Features

Share Open Datasets With the Community
Goliath provides a platform for individuals and organizations to create open datasets, giving more value to previously siloed, underutilized datasets.

High-Performance Computing With Less Data Transfer
Use existing datasets on Goliath and connect them to your Apex-powered AI projects without having to upload your own copy of the dataset for every project.

Control Who Has Access to Your Dataset
With multiple permission levels on Goliath, you can make your datasets publicly available, available upon request, or shared only with specific people.
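The three sharing options named in this section (publicly available, available upon request, shared only with specific people) can be illustrated with a small access check. This is a minimal sketch, not Goliath's actual implementation: the `Visibility`, `Dataset`, and `can_access` names, and the rule that an owner always has access, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    """Hypothetical labels for the three sharing options in the slides."""
    PUBLIC = "public"
    ON_REQUEST = "on_request"
    RESTRICTED = "restricted"


@dataclass
class Dataset:
    name: str
    owner: str
    visibility: Visibility
    # Users explicitly granted access (used when visibility is RESTRICTED).
    allowed_users: set = field(default_factory=set)
    # Users whose access request was approved (used when ON_REQUEST).
    approved_requests: set = field(default_factory=set)


def can_access(dataset: Dataset, user: str) -> bool:
    """Return True if `user` may read `dataset` under this sketch's rules."""
    if user == dataset.owner:  # assumption: owners always have access
        return True
    if dataset.visibility is Visibility.PUBLIC:
        return True
    if dataset.visibility is Visibility.ON_REQUEST:
        return user in dataset.approved_requests
    return user in dataset.allowed_users
```

For example, a dataset created with `Visibility.ON_REQUEST` stays closed to a user until that user's name is added to `approved_requests`.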

Common Open Datasets to Become Available on Apex & Goliath

● COCO (Common Objects in Context)
● Mozilla Common Voice
● Thai local dialect
● WordNet
● ImageNet
● and more...

Common Schema for Research Datasets

To make the most of a diverse set of datasets from multiple sources, we also propose the "CMKL Common Dataset Schema Standard version 1.0" to make merging different datasets easier for researchers.
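To show why a shared schema makes merging easier, here is a minimal sketch in which two differently shaped sources are mapped onto one record type. The slides do not describe the fields of the CMKL schema, so `CommonRecord`, its fields, the adapter functions, and the sample rows are all hypothetical stand-ins rather than the actual standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common record; NOT the actual CMKL schema fields."""
    source: str    # which dataset the item came from
    modality: str  # e.g. "image" or "speech"
    uri: str       # where the raw item is stored
    label: str     # annotation attached to the item


def from_image_row(row: dict) -> CommonRecord:
    # Adapter for a made-up image dataset layout.
    return CommonRecord("images_a", "image", row["file_name"], row["category"])


def from_speech_row(row: dict) -> CommonRecord:
    # Adapter for a made-up speech dataset layout.
    return CommonRecord("speech_b", "speech", row["path"], row["sentence"])


# Illustrative rows only; not taken from any real dataset.
images = [{"file_name": "img/0001.jpg", "category": "dog"}]
speech = [{"path": "clips/0001.mp3", "sentence": "hello"}]

# Once both sources share one record type, merging is plain concatenation.
merged = [from_image_row(r) for r in images] + [from_speech_row(r) for r in speech]
```

With one adapter per source, downstream code only needs to understand `CommonRecord`, whatever the original layout was.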

AI Research Global Partnership

● Technology providers: Nvidia, DataDirect Networks
● Academic institutions: Carnegie Mellon University

For more information: https://www.cmkl.ac.th/apex
