Using Jupyter Notebooks Easily with Amazon SageMaker - 윤석찬 (AWS...
TRANSCRIPT
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
http://hunkim.github.io/ml/
FRAMEWORKS AND INTERFACES
P3
P3 Instance Deep Learning AMI
Frameworks
PLATFORM SERVICES
VISION LANGUAGE VR/IR
APPLICATION SERVICE
AWS DeepLens, Amazon SageMaker, Amazon Machine Learning, Amazon EMR & Spark, Mechanical Turk
AWS DEEP LEARNING AMI
Apache MXNet, TensorFlow, Caffe2, Torch, Keras, CNTK, PyTorch, Gluon, Theano
INSTANCES
GPU (G2/P2/P3) CPU (C5)
NVIDIA Tesla V100 GPU
5,120 Tensor cores 1 Petaflop
128GB of memory NVLink 2.0
14X faster than P2
K-Means Clustering
Principal Component Analysis
Neural Topic Modelling
Factorization Machines
Linear Learner - Regression
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
Linear Learner - Classification
ALGORITHMS
Apache MXNet, TensorFlow
Caffe2, CNTK, PyTorch, Torch
FRAMEWORKS
Use case                             Algorithm                           Type
Discrete Classification, Regression  Linear Learner                      Supervised
                                     XGBoost Algorithm                   Supervised
Discrete Recommendations             Factorization Machines              Supervised
Image Classification                 Image Classification Algorithm      Supervised, CNN
Neural Machine Translation           Sequence to Sequence                Supervised, seq2seq
Time-series Prediction               DeepAR                              Supervised, RNN
Discrete Groupings                   K-Means Algorithm                   Unsupervised
Dimensionality Reduction             PCA (Principal Component Analysis)  Unsupervised
Topic Determination                  Latent Dirichlet Allocation (LDA)   Unsupervised
                                     Neural Topic Model (NTM)            Unsupervised, Neural Network Based
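The mapping on this slide can be kept as a small lookup structure when choosing a built-in algorithm. The sketch below only transcribes the slide; the grouping of XGBoost under "Discrete Classification, Regression" and of NTM under "Topic Determination" is an assumption read from the slide layout, and the names `BUILT_IN_ALGORITHMS` and `algorithms_for` are hypothetical helpers, not part of any SageMaker API.

```python
# Use case -> list of (algorithm, learning type), transcribed from the slide.
# Row grouping is an assumption based on the slide layout.
BUILT_IN_ALGORITHMS = {
    "Discrete Classification, Regression": [
        ("Linear Learner", "Supervised"),
        ("XGBoost Algorithm", "Supervised"),
    ],
    "Discrete Recommendations": [("Factorization Machines", "Supervised")],
    "Image Classification": [("Image Classification Algorithm", "Supervised, CNN")],
    "Neural Machine Translation": [("Sequence to Sequence", "Supervised, seq2seq")],
    "Time-series Prediction": [("DeepAR", "Supervised, RNN")],
    "Discrete Groupings": [("K-Means Algorithm", "Unsupervised")],
    "Dimensionality Reduction": [("PCA (Principal Component Analysis)", "Unsupervised")],
    "Topic Determination": [
        ("Latent Dirichlet Allocation (LDA)", "Unsupervised"),
        ("Neural Topic Model (NTM)", "Unsupervised, Neural Network Based"),
    ],
}


def algorithms_for(use_case):
    """Return the algorithm names listed for a use case (empty list if unknown)."""
    return [name for name, _ in BUILT_IN_ALGORITHMS.get(use_case, [])]


def unsupervised_use_cases():
    """Return the use cases whose listed algorithms are all unsupervised."""
    return [
        use_case
        for use_case, entries in BUILT_IN_ALGORITHMS.items()
        if all(kind.startswith("Unsupervised") for _, kind in entries)
    ]
```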
K-Means Clustering
Principal Component Analysis
Neural Topic Modelling
Factorization Machines
Linear Learner - Regression
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
Linear Learner - Classification
BUILT-IN ALGORITHMS
Caffe2, CNTK, PyTorch, Torch
IM Estimators in Spark
DEEP LEARNING FRAMEWORKS
Bring Your Own Script (IM builds the Container)
BRING YOUR OWN MODEL
ML Training code
Fetch Training data
Save Model Artifacts
Amazon ECR
Save Inference Image
Amazon S3
https://nucleusresearch.com/research/single/guidebook-tensorflow-aws/
In analyzing the experiences of researchers supporting more than 388 unique projects, Nucleus found that 88 percent of cloud-based TensorFlow projects are running on Amazon Web Services (AWS).
from sagemaker.tensorflow import TensorFlow

tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
                          training_steps=10000, evaluation_steps=100,
                          train_instance_count=1, train_instance_type='ml.p2.xlarge')
tf_estimator.fit('s3://bucket/path/to/training/data')

from sagemaker.mxnet import MXNet

# role is required by the estimator; the same role as the TensorFlow example is assumed here
mxnet_estimator = MXNet("mx-train.py", role="SageMakerRole",
                        train_instance_type="ml.p2.xlarge", train_instance_count=1)
mxnet_estimator.fit("s3://my_bucket/my_training_data/")
predictor = tf_estimator.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')
predictor = mxnet_estimator.deploy(deploy_instance_type="ml.p2.xlarge", min_instances=1,
https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/model-name/invocations
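The invocation URL above follows a fixed pattern of region and endpoint name. A minimal sketch of building it is shown here; `invocation_url` is a hypothetical helper, not an SDK function, and a real request to this URL also has to be signed with AWS Signature Version 4, which the SDKs handle for you.

```python
def invocation_url(region, endpoint_name):
    """Build the SageMaker runtime invocation URL for an endpoint.

    Mirrors the URL pattern shown on the slide; it performs no network call
    and no request signing.
    """
    if not region or not endpoint_name:
        raise ValueError("region and endpoint_name must be non-empty")
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


# Reproduces the URL from the slide.
url = invocation_url("us-east-1", "model-name")
```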
SageMaker Notebooks
Training Algorithm
SageMaker Training
Amazon ECR
CodeCommit
CodePipeline
SageMaker Hosting
Coco dataset
AWS Lambda
API Gateway
Build
Train
Deploy
static website hosted on S3
Inference requests
Amazon S3
Amazon CloudFront
Web assets on CloudFront
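In this architecture, inference requests reach SageMaker Hosting through API Gateway and AWS Lambda. A minimal sketch of such a Lambda handler follows, assuming an API Gateway proxy integration and a JSON payload; the endpoint name `my-endpoint` and the optional `runtime_client` parameter are hypothetical and are there only so the function can be exercised without AWS credentials.

```python
ENDPOINT_NAME = "my-endpoint"  # hypothetical endpoint name


def handler(event, context, runtime_client=None):
    """Forward the request body to a SageMaker endpoint and return its reply."""
    if runtime_client is None:
        # Imported lazily so the sketch can be tested with a stub client.
        import boto3
        runtime_client = boto3.client("sagemaker-runtime")

    response = runtime_client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",  # assumed payload format
        Body=event["body"],
    )
    # The response body is a stream; read and decode it before returning.
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": prediction}
```

Passing the client in as a parameter keeps the handler easy to test locally; in Lambda itself the default branch creates the real `sagemaker-runtime` client.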
import boto3

sagemaker = boto3.client(service_name='sagemaker')
# training_params, model_name, role, primary_container, endpoint_config_name
# and endpoint_name are assumed to be defined earlier
sagemaker.create_training_job(**training_params)
create_model_response = sagemaker.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container)
endpoint_config_response = sagemaker.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{'InstanceType': 'ml.m4.xlarge', 'InitialInstanceCount': 1,
                         'ModelName': model_name, 'VariantName': 'AllTraffic'}])
endpoint_response = sagemaker.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)
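The request parameters for the endpoint configuration and the endpoint can be assembled as plain dictionaries and passed to the client with `**`. The sketch below builds only the dictionaries, using the keys that appear on the slide; the helper names and the example resource names are hypothetical, and no AWS call is made.

```python
def endpoint_config_params(endpoint_config_name, model_name,
                           instance_type="ml.m4.xlarge", instance_count=1):
    """Keyword arguments for create_endpoint_config with one production variant."""
    return {
        "EndpointConfigName": endpoint_config_name,
        "ProductionVariants": [{
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }],
    }


def endpoint_params(endpoint_name, endpoint_config_name):
    """Keyword arguments for create_endpoint."""
    return {
        "EndpointName": endpoint_name,
        "EndpointConfigName": endpoint_config_name,
    }


# Hypothetical names, for illustration only.
config = endpoint_config_params("demo-config", "demo-model")
endpoint = endpoint_params("demo-endpoint", "demo-config")
# With a boto3 SageMaker client these would be used as:
#   client.create_endpoint_config(**config)
#   client.create_endpoint(**endpoint)
```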
[Figure: Speedup (x) vs. # GPUs, both axes from 1 to 16; series: Resnet 152, Inception V3, Alexnet, Ideal]
P2.16xlarge (8 Nvidia Tesla K80 - 16 GPUs)
Synchronous SGD (Stochastic Gradient Descent)
91% Efficiency
88% Efficiency
16x P2.16xlarge by AWS CloudFormation
Mounted on Amazon EFS
[Figure: scaling chart, x-axis: # GPUs]
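Scaling efficiency is the measured speedup divided by the number of GPUs, so an efficiency figure converts directly into an effective speedup. The sketch below shows the arithmetic; pairing 91% with a single P2.16xlarge (16 GPUs) and 88% with 16 such instances (256 GPUs) is an assumption, since the slide does not state which figure belongs to which setup.

```python
def scaling_efficiency(speedup, num_gpus):
    """Efficiency = measured speedup / ideal (linear) speedup."""
    return speedup / num_gpus


def speedup_from_efficiency(efficiency, num_gpus):
    """Effective speedup over one GPU for a given efficiency."""
    return efficiency * num_gpus


# Assumed pairing of the slide's figures with GPU counts.
single_instance = speedup_from_efficiency(0.91, 16)    # about 14.56x
sixteen_instances = speedup_from_efficiency(0.88, 256)  # about 225.28x
```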
## train data
import mxnet as mx

# softmax, train, val and batch_size are assumed to be defined earlier
num_gpus = 4
gpus = [mx.gpu(i) for i in range(num_gpus)]
model = mx.model.FeedForward(
    ctx=gpus, symbol=softmax, num_round=20,
    learning_rate=0.01, momentum=0.9, wd=0.00001)
model.fit(X=train, eval_data=val,
          batch_end_callback=mx.callback.Speedometer(batch_size=batch_size))
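The slides name synchronous SGD as the multi-GPU training scheme: each device computes a gradient on its shard of the batch, the gradients are averaged, and every device applies the same update. The sketch below illustrates that idea in plain Python on a one-parameter model. It is not MXNet's implementation; the model `y = w * x`, the helper names, and the toy data are assumptions made for illustration.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def split(batch, num_devices):
    """Split a batch into equal contiguous shards, one per device."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by num_devices")
    size = len(batch) // num_devices
    return [batch[i * size:(i + 1) * size] for i in range(num_devices)]


def synchronous_step(w, batch, num_devices, learning_rate):
    """One synchronous SGD step: average per-device gradients, then update."""
    shards = split(batch, num_devices)
    grads = [gradient(w, shard) for shard in shards]  # parallel on real hardware
    avg_grad = sum(grads) / num_devices
    return w - learning_rate * avg_grad


# Toy data generated from y = 3x; 8 samples split across 4 "devices".
batch = [(x, 3 * x) for x in range(1, 9)]
w_multi = synchronous_step(0.0, batch, num_devices=4, learning_rate=0.01)
w_single = 0.0 - 0.01 * gradient(0.0, batch)
```

With equal shard sizes the averaged gradient equals the full-batch gradient, so the multi-device step matches the single-device step; the extra devices change the throughput, not the update.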
...-based examples
http://mxnet.io/
https://github.com/dmlc/mxnet
http://incubator.apache.org/projects/mxnet.html
http://gluon.mxnet.io
We plan to use Amazon SageMaker to train models against petabytes of Earth observation imagery datasets using hosted Jupyter notebooks, so DigitalGlobe's Geospatial Big Data Platform (GBDX) users can just push a button, create a model, and deploy it all within one scalable distributed environment at scale.
- Dr. Walter Scott, CTO of Maxar Technologies and founder of DigitalGlobe
“With Amazon SageMaker, we can accelerate our Artificial Intelligence initiatives at scale by building and deploying our algorithms on the platform. We will create novel large-scale machine learning and AI algorithms and deploy them on this platform to solve complex problems that can power prosperity for our customers.”
- Ashok Srivastava, Chief Data Officer, Intuit
[Figure: y-axis: $ to $$$$; x-axis: Minutes, Hours, Days, Weeks, Months; series: Single Machine; Distributed, with Strong Machines]
[Figure: y-axis: $ to $$$$; x-axis: Minutes, Hours, Days, Weeks, Months; series: EC2 + AMI; Amazon SageMaker; On-premise]