MLAgents - RollerBall

2020. 7. 14. 16:32 · Unity/Class Notes

1. Create a Unity project

2. Set up the ML-Agents environment

Menu > Window > Package Manager > search for "ML" > install ML Agents

3. Create a Plane ("Floor") at (0, 0, 0)

4. Create a Cube ("Target") at (0, 0.5, 3)

5. Create a Sphere ("RollerAgent") at (0, 0.5, 0)

6. Add a Rigidbody component to RollerAgent

7. Create an empty GameObject named TrainingArea and add Floor, Target, and RollerAgent as its children

8. Create the RollerAgent.cs file

9. Add the namespaces

1) using Unity.MLAgents;

2) using Unity.MLAgents.Sensors;

10. Inherit from Agent

11. Override Initialize, OnEpisodeBegin, CollectObservations, and OnActionReceived

12. Write the script (RollerAgent)

using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class RollerAgent : Agent
{
    public Transform target;
    public float speed = 10;
    private Rigidbody rBody;

    /// <summary>
    /// Called once when the agent is initialized; caches the Rigidbody.
    /// </summary>
    public override void Initialize()
    {
        this.rBody = this.GetComponent<Rigidbody>();
    }

    /// <summary>
    /// Called at the start of each episode to set up the environment for the new episode.
    /// </summary>
    public override void OnEpisodeBegin()
    {
        // If the agent fell off the platform
        if (this.transform.localPosition.y < 0)
        {
            // Reset the velocities to zero
            this.rBody.angularVelocity = Vector3.zero;
            this.rBody.velocity = Vector3.zero;
            // Move the agent back to its starting position
            this.transform.localPosition = new Vector3(0.0f, 0.5f, 0.0f);
        }

        // Move the target to a random position (x and z each in the range -4 to 4)
        this.target.localPosition = new Vector3(
            Random.value * 8 - 4, 0.5f, Random.value * 8 - 4);
    }

    /// <summary>
    /// Collects the observations that are sent to the Python API.
    /// Eight floats in total: 3 + 3 + 1 + 1.
    /// </summary>
    /// <param name="sensor">The vector sensor that receives the observations.</param>
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(this.target.localPosition);       // Target position (3 floats)
        sensor.AddObservation(this.transform.localPosition);    // Sphere position (3 floats)
        sensor.AddObservation(this.rBody.velocity.x);           // Sphere velocity along x
        sensor.AddObservation(this.rBody.velocity.z);           // Sphere velocity along z
    }

    /// <summary>
    /// Applies the received action as a force, then assigns the reward
    /// and ends the episode when the target is reached or the agent falls.
    /// </summary>
    /// <param name="vectorAction">Two continuous values: force along x and z.</param>
    public override void OnActionReceived(float[] vectorAction)
    {
        Vector3 controlSignal = Vector3.zero;
        controlSignal.x = vectorAction[0];
        controlSignal.z = vectorAction[1];
        this.rBody.AddForce(controlSignal * this.speed);

        float distanceToTarget = Vector3.Distance(
            this.transform.localPosition, this.target.localPosition);

        if (distanceToTarget < 1.42f)
        {
            // Reached the target
            AddReward(1.0f);
            EndEpisode();
        }
        else if (this.transform.localPosition.y < 0)
        {
            // Fell off the platform
            EndEpisode();
        }
    }
}
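
Before training, it can help to drive the ball by hand and confirm that the force, the reward, and the episode reset behave as expected. The following is a minimal sketch, not part of the original class notes: KeyboardRollerAgent is a hypothetical class name, and it assumes the same ML-Agents package version as the script above, where Heuristic receives a float[] (newer versions use a different signature). It also assumes Unity's default "Horizontal" and "Vertical" input axes.

```csharp
using UnityEngine;

// Hypothetical helper (not in the original post): extends the RollerAgent
// class above so the sphere can be controlled from the keyboard.
// Assumes the ML-Agents version used in this post, where Heuristic
// fills a float[] with the same layout that OnActionReceived reads.
public class KeyboardRollerAgent : RollerAgent
{
    public override void Heuristic(float[] actionsOut)
    {
        // Index 0 is the force along x, index 1 the force along z,
        // matching vectorAction[0] and vectorAction[1] in OnActionReceived.
        actionsOut[0] = Input.GetAxis("Horizontal");
        actionsOut[1] = Input.GetAxis("Vertical");
    }
}
```

To try it, attach this component to the sphere in place of RollerAgent, set Behavior Type to Heuristic Only in the Behavior Parameters component, and press Play. The arrow keys should then move the ball.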

 

 
13. RollerAgent Inspector
 a. Add the Decision Requester component
 - Set Decision Period to 10
 
14. Behavior Parameters
 - Behavior Name : RollerBall
 - Vector Observation
  - Space Size : 8 (target position 3 + agent position 3 + x and z velocity 2)
 - Vector Action
  - Space Type : Continuous
  - Space Size : 2
 
15. Create rollerball_config.yaml
 a. Create a config folder inside the project folder
 b. Create the rollerball_config.yaml file inside it
 c. Set its contents as follows


behaviors:
  RollerBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 10
      buffer_size: 100
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    checkpoint_interval: 500000
    max_steps: 500000
    time_horizon: 64
    summary_freq: 1000
    threaded: true

 

(└> Here, "RollerBall" must match the Behavior Name value of the Behavior Parameters component attached to the Agent)
 
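16. Start the simulation
 
a. Open cmd
b. Use the cd command to move to the folder that contains rollerball_config.yaml
c. Enter the training command, then press Play in the Unity Editor when the prompt asks for it.
 
- The command is as follows.
mlagents-learn rollerball_config.yaml --run-id=RollerBall
(└> The name after --run-id= is the name given to the output created under results)
 
d. When training finishes, press Ctrl + C in the cmd window to exit
e. A results folder will have been created inside the project folder (in the directory where the command was run). Assign the *.nn file in it to the Agent, in the Model field of Behavior Parameters.
f. After that, press Play in Unity and the agent moves the way it was trained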
 
 
