Discovering information in a complex network using reinforcement learning.

 In this talk we are going to review the fundamentals of deep learning and see how it can be applied to discover information
 in a complex network (e.g., a company computer network). To do this, we will formally define the problem through a finite 
Markov decision process, defining the value functions and the optimal policies. One of the most interesting points in reinforcement 
learning is the estimation of value functions and the discovery of optimal policies. In this talk, we will give a brief overview of 
how we are dealing with this problem using dynamic programming and Monte Carlo methods to design possible reinforcement 
learning solutions based on time difference learning or the so-called n-step bootstrapping approach.