Course Details
Topic 1: Introduction to Reinforcement Learning
- What is Reinforcement Learning (RL)?
- Markov Decision Process (MDP) and RL
- Applications of RL
- Classification of RL Algorithms
Topic 2: OpenAI Gym
- What is OpenAI Gym
- Install OpenAI Gym
- OpenAI Gym Operations
Topic 3: Value Based Q-Learning
- What is Q-Learning
- Q Value and Q-Table
- Bellman Equation
- Q-Learning Algorithm
- Epsilon Greedy Explore-Exploit Strategy
- On-Policy vs Off-Policy Learning
- What is SARSA?
- SARSA Algorithm
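As an illustrative sketch of the Topic 3 ideas (not taken from the course materials), the tabular Q-learning and SARSA updates with an epsilon-greedy policy might look like this in Python. The five-state corridor environment, the function names, and the hyperparameter values are all assumptions made for the example.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (0, 1)    # 0 = move left, 1 = move right

def step(state, action):
    """Toy environment: reward 1.0 on reaching the goal, otherwise 0.0."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (random ties)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(method="q_learning", episodes=500, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # the Q-table
    for _ in range(episodes):
        state, done = 0, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            nxt, reward, done = step(state, action)
            if method == "sarsa":
                # On-policy: bootstrap from the action actually taken next.
                next_action = epsilon_greedy(q, nxt, epsilon, rng)
                bootstrap = q[nxt][next_action]
            else:
                # Off-policy: bootstrap from the greedy value of the next state.
                bootstrap = max(q[nxt])
            target = reward + (0.0 if done else gamma * bootstrap)
            q[state][action] += alpha * (target - q[state][action])
            if method != "sarsa":
                next_action = epsilon_greedy(q, nxt, epsilon, rng)
            state, action = nxt, next_action
    return q
```

Calling train("q_learning") or train("sarsa") returns the learned Q-table. The only difference between the two methods in this sketch is the bootstrap term, which is the on-policy versus off-policy distinction listed in the outline.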
Topic 4: Policy-Based Learning
- Policy-Based Methods
- Policy Gradient Algorithm
- Implementation of Policy Gradient Algorithm
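For a rough idea of what a policy gradient implementation involves, here is a minimal REINFORCE-style sketch on a two-armed bandit with a softmax policy. It is not the course's implementation. The arm reward probabilities, learning rate, running-mean baseline, and function names are assumptions chosen for illustration.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(arm_means=(0.2, 0.8), steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)   # policy parameters, one per action
    baseline = 0.0                   # running mean reward (variance reduction)
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        # Hypothetical Bernoulli reward for the chosen arm.
        reward = 1.0 if rng.random() < arm_means[action] else 0.0
        advantage = reward - baseline
        for a in range(len(prefs)):
            # d/d prefs[a] of log pi(action) is 1[a == action] - probs[a].
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log
        baseline += (reward - baseline) / t
    return softmax(prefs)
```

With these assumed settings, reinforce_bandit() returns action probabilities that should favour the arm with the higher reward probability. The policy is adjusted directly, without a Q-table.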
Topic 5: Overview of Advanced RL Algorithms
- Limitations of Value-Based and Policy-Based Learning
- Actor-Critic Algorithms
- Deep Reinforcement Learning Algorithms
Topic 6: Model-Based Learning
- What is Model-Based Learning
- Model-Based Q-Learning Algorithms
Final Assessment
- Written Assessment - Short Answer Questions (WA-SAQ)
- Practical Performance (PP)
Course Info
Promotion Code
Promo codes and discounts cannot be applied to WSQ courses.
Minimum Entry Requirement
Knowledge and Skills
- Able to use a computer, with at least Computer Literacy Level 2 under the ICAS Computer Skills Assessment Framework
- Minimum of 3 GCE ‘O’ Level passes, including English, or WPL Level 5 (average of Reading, Listening, Speaking and Writing scores)
Attitude
- Positive Learning Attitude
- Enthusiastic Learner
Experience
- Minimum of 1 year of working experience.
- Minimum 18 years old
Minimum Software/Hardware Requirement
Software: NIL
Hardware: Windows and Mac Laptops
About Progressive Wage Model (PWM)
The Progressive Wage Model (PWM) helps to increase wages of workers through upgrading skills and improving productivity.
Employers must ensure that their Singapore citizen and PR workers meet the PWM training requirements of attaining at least 1 Workforce Skills Qualification (WSQ) Statement of Attainment, out of the list of approved WSQ training modules.
For more information on PWM, please visit the MOM website.
Funding Eligibility Criteria
| Individual Sponsored Trainee | Employer Sponsored Trainee |
| SkillsFuture Credit; PSEA | Absentee Payroll (AP) Funding; SFEC |
Steps to Apply for a SkillsFuture Claim
- The staff will send you an invoice with the fee breakdown.
- Log in to the MySkillsFuture portal, select the course you’re enrolling in and enter the course date and schedule.
- Enter the course fee payable by you (including GST) and enter the amount of credit to claim.
- Upload your invoice and click ‘Submit’.
SkillsFuture Level-Up Programme
The SkillsFuture Level-Up Programme provides greater structural support for mid-career Singaporeans aged 40 years and above to pursue a substantive skills reboot and stay relevant in a changing economy. For more information, visit SkillsFuture Level-Up Programme.
Get Additional Course Fee Support Up to $500 under UTAP
The Union Training Assistance Programme (UTAP) is a training benefit that encourages NTUC Union Members to upgrade their skills by reducing training costs. NTUC Union Members can get 50% funding (capped at $500 per year) under UTAP.
For more information, visit NTUC U Portal – Union Training Assistance Programme (UTAP).
Steps to Apply UTAP
- Log in to your U Portal account to submit your UTAP application upon completion of the course.
Note
- SSG subsidy is available for Singapore Citizens, Permanent Residents, and Corporates.
- All Singaporeans aged 25 and above can use their SkillsFuture Credit to pay. For more details, visit www.skillsfuture.gov.sg/credit
- An unfunded course fee can be claimed via SkillsFuture Credit or paid in cash.
- UTAP funding for NTUC Union Members is capped at $250 for 39 years and below and at $500 for 40 years and above.
- The learner pays the UTAP support amount to the training provider first and claims it after the class ends.
Appeal Process
- The candidate has the right to disagree with the assessment decision made by the assessor.
- When giving feedback to the candidate, the assessor must check with the candidate if he/she agrees with the assessment outcome.
- If the candidate agrees with the assessment outcome, the assessor & the candidate must sign the Assessment Summary Record.
- If the candidate disagrees with the assessment outcome, he/she should not sign the Assessment Summary Record.
- If the candidate intends to appeal the decision, he/she should first discuss the matter with the assessor/assessment manager.
- If the candidate is still not satisfied with the decision, the candidate must notify the assessor of the decision to appeal. The assessor will reflect the candidate’s intention in the Feedback Section of the Assessment Summary Record.
- The assessor will notify the assessment manager about the candidate’s intention to lodge an appeal.
- The candidate must lodge the appeal within 7 days, giving reasons for the appeal.
- The assessor can help the candidate with writing and lodging the appeal.
- The assessment manager will collect information from the candidate & assessor and give a final decision.
- A record of the appeal and any subsequent actions and findings will be made.
- An Assessment Appeal Panel will be formed to review and give a decision.
- The outcome of the appeal will be made known to the candidate within 2 weeks from the date the appeal was lodged.
- The decision of the Assessment Appeal Panel is final and no further appeal will be entertained.
- To lodge an appeal, fill in the Candidate’s Appeal Form.
Job Roles
- Machine Learning Engineer
- Robotics Engineer
- Game Developer (AI-focused)
- AI Research Scientist
- Data Scientist (branching into RL)
- Autonomous Systems Developer
- Simulation Engineer (using RL)
- Optimization Specialist
- AI Product Manager (oversight on RL projects)
- Control Systems Engineer (using RL)
- Finance Quant (using RL for trading strategies)
- NLP Engineer (using RL for certain applications)
- Recommendation System Developer (using RL)
- AI Solutions Architect
- Drone Algorithm Developer
Trainers
Solomon Soh Zhe Hong: Solomon Soh is a data scientist and AI trainer with extensive experience in reinforcement learning, deep learning, and optimization. At Workforce Optimizer, he spearheaded R&D in Job-Shop Reinforcement Learning, developing solutions that improved operational efficiency by 15% and reduced staffing costs through discrete optimization. He has also supervised 24 machine learning and deep learning projects at IBM Singapore, coaching teams on methodologies, feature engineering, and model deployment.
In his reinforcement learning training, Solomon emphasizes practical applications of RL in scheduling, forecasting, and resource optimization. His teaching covers fundamental RL concepts, policy-based methods, and hands-on exercises with Python frameworks. By blending real-world projects with technical expertise, Solomon ensures learners build both conceptual understanding and applied skills in reinforcement learning.
Tan Woei Ming: Tan Woei Ming is an AI engineer and data scientist with over 15 years of experience specializing in machine learning, deep learning, and intelligent automation. He holds a Master’s degree in Intelligent Systems from the National University of Singapore (NUS) and has led numerous AI projects in predictive analytics, image recognition, and process optimization within the semiconductor and manufacturing industries. His expertise includes Python, PyTorch, TensorFlow, and reinforcement learning (RL), where he applies computational models to improve decision-making and automation in complex systems.
In “Practical Reinforcement Learning for Beginners,” Woei Ming introduces participants to the foundational principles and applications of RL through hands-on coding and simulations. His sessions emphasize understanding key algorithms such as Q-learning and Deep Q-Networks (DQN), and their implementation using Python. By combining theoretical grounding with practical exercises, he helps learners build intuition on how RL agents learn through interaction, preparing them to apply these concepts in robotics, automation, and data-driven optimization tasks.
Dr. Alfred Ang: Dr. Alfred Ang is a technology leader and AI researcher with more than 20 years of experience in artificial intelligence, software engineering, and data analytics. As the CTO and Chief Instructional Designer at Tertiary Infotech, he has spearheaded the design of over 500 accredited programs in emerging technologies, including machine learning, AI ethics, and automation. His expertise spans reinforcement learning, neural networks, and applied AI systems, combining academic depth with extensive industry practice. Dr. Ang’s research-driven teaching bridges theoretical AI models with real-world implementation strategies.
In “Practical Reinforcement Learning for Beginners,” Dr. Ang guides learners through the practical use of reinforcement learning for solving sequential decision-making problems. His sessions explore concepts such as agent-environment interaction, reward optimization, and policy learning using intuitive examples and open-source frameworks. Through project-based learning, he enables participants to develop and experiment with their own RL agents, gaining first-hand experience in training models that can learn and adapt autonomously.
Quah Chee Yong: Quah Chee Yong is an ACLP-certified trainer and data science professional with strong expertise in machine learning, NLP, and AI systems. As AI Solutions Lead at AiDeal Scan, he developed advanced recommender systems and NLP-driven search engines, while at GoWild Singapore he led the data science team to build analytics platforms and chatbot solutions powered by reinforcement learning. He has also served as Data Science Training Lead for SAP and Temasek Polytechnic programs under IMDA, delivering practical AI and ML training for professionals.
In his reinforcement learning training, Quah focuses on helping learners understand RL concepts through applied case studies. His sessions cover Markov decision processes, value iteration, and Q-learning, with examples drawn from customer analytics, recommender systems, and chatbots. By combining commercial project experience with training expertise, Quah equips learners to apply reinforcement learning techniques effectively in business and technical contexts.
Dr Alvin Ang: Dr Alvin Ang is an ACLP-certified trainer with a Ph.D. in Operations Research from Nanyang Technological University and more than a decade of experience in AI, optimization, and applied machine learning. He has taught at NTU, SUSS, Curtin University, and SP Jain School of Global Management, and also served as an IBM Data Science Instructor. His certifications include TensorFlow, machine learning with Python, and deep learning with Keras, supported by extensive experience in AI model design and deployment.
In his reinforcement learning training, Dr Ang introduces learners to RL fundamentals such as exploration vs. exploitation, reward structures, and policy gradients. His teaching emphasizes practical implementation with Python and TensorFlow, ensuring learners can design and evaluate RL models in applied settings. By blending academic rigor with practical coding exercises, Dr Ang enables participants to gain both theoretical depth and real-world problem-solving skills in reinforcement learning.
Customer Reviews (5)
- might recommend (Review by Course Participant/Trainee, posted on 3/26/2023)
- recommended (Review by Course Participant/Trainee, posted on 7/31/2022): Informative and it's great to learn practically. Would have been nice to learn how to create a gym for a new unique scenario. I recorded the links I used to install the packages for scripts in an M1 env. Let me know if you need it. Bash files to install packages and environment.
- will recommend (Review by Course Participant/Trainee, posted on 6/17/2022): Tertiary Courses has put much effort into preparing the material. There is a good balance between the theory and coding. Having said that, RL is also a pretty tough subject. Recommend splitting the two training days into non-consecutive days, or consider increasing the 2-day training to 3 days.
- will recommend (Review by Course Participant/Trainee, posted on 6/17/2022)
- will recommend (Review by Course Participant/Trainee, posted on 1/23/2022): More applications to commercial domains. Good attempt for a difficult topic, keep it up!








