Artificial Intelligence Preprint | 2019-03-02

SDRL: Interpretable and Data-efficient Deep Reinforcement Learning Leveraging Symbolic Planning (1811.00090v4)

Daoming Lyu, Fangkai Yang, Bo Liu, Steven Gustafson

2018-10-31

Deep reinforcement learning (DRL) has achieved great success by learning directly from high-dimensional sensory inputs, yet is notorious for its lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making, as it increases the transparency of black-box-style DRL approaches and helps RL practitioners understand the high-level behavior of the system better. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. Task-level interpretability is enabled by relating symbolic actions to options. The framework features a planner -- controller -- meta-controller architecture, which takes charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the long-term planning capability of symbolic knowledge and end-to-end reinforcement learning directly from high-dimensional sensory input. Experimental results validate the interpretability of the subtasks, along with improved data efficiency compared with state-of-the-art approaches.
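
For intuition, here is a minimal, hypothetical sketch of how the three components described above could interact; the class names, the dummy subtasks, and the simulated training signal are illustrative stand-ins rather than the paper's implementation.

```python
# Hypothetical sketch of a planner-controller-meta-controller loop;
# everything here is a toy stand-in, not the SDRL codebase.

import random

class Planner:
    """Proposes an ordering of symbolic subtasks given learned subtask values."""
    def plan(self, subtask_values):
        # Stand-in for symbolic planning: prefer subtasks with higher learned value.
        return sorted(subtask_values, key=subtask_values.get, reverse=True)

class Controller:
    """Learns a policy (option) for a single symbolic subtask from raw input."""
    def train(self, subtask):
        # Placeholder for DRL training on high-dimensional input;
        # returns a simulated success rate for the learned option.
        return random.random()

class MetaController:
    """Evaluates learned options and feeds values back to the planner."""
    def evaluate(self, subtask, success_rate):
        return success_rate  # stand-in for intrinsic/extrinsic reward mixing

subtask_values = {"get_key": 0.0, "open_door": 0.0, "reach_goal": 0.0}
planner, controller, meta = Planner(), Controller(), MetaController()

for episode in range(5):
    for subtask in planner.plan(subtask_values):
        success = controller.train(subtask)
        subtask_values[subtask] = meta.evaluate(subtask, success)
    print(episode, subtask_values)
```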

Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport (1812.04597v2)

Adarsh Subbaswamy, Peter Schulam, Suchi Saria

2018-12-11

Classical supervised learning produces unreliable models when training and target distributions differ, and most existing solutions require samples from the target domain. We propose a proactive approach that learns a relationship in the training domain which will generalize to the target domain, by incorporating prior knowledge about the aspects of the data-generating process that are expected to differ, as expressed in a causal selection diagram. Specifically, we remove variables generated by unstable mechanisms from the joint factorization to yield the Surgery Estimator---an interventional distribution that is invariant to the differences across environments. We prove that the surgery estimator finds stable relationships in strictly more scenarios than previous approaches that consider only conditional relationships, and demonstrate this in simulated experiments. We also evaluate on real-world data for which the true causal diagram is unknown, performing competitively against entirely data-driven approaches.
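
To make the idea concrete, below is a toy numerical illustration (not the paper's estimator) of why an estimate built from stable mechanisms can be invariant while the plain conditional is not; the graph, the probability tables, and the variable names are assumptions made up for the example.

```python
# Toy example: with graph Z -> X, Z -> Y, X -> Y, the factors P(X | Z) and
# P(Y | X, Z) are stable mechanisms, while P(Z) shifts across environments.
# The observational P(Y | X) mixes in the unstable P(Z) and so changes between
# environments; an estimate built only from the stable factors does not.

def p_y1_given_x1(p_z1, p_x1_given_z, p_y1_given_xz):
    """Observational P(Y=1 | X=1), marginalizing Z with an environment-specific P(Z)."""
    num = sum(pz * p_x1_given_z[z] * p_y1_given_xz[(1, z)]
              for z, pz in ((0, 1 - p_z1), (1, p_z1)))
    den = sum(pz * p_x1_given_z[z] for z, pz in ((0, 1 - p_z1), (1, p_z1)))
    return num / den

p_x1_given_z = {0: 0.2, 1: 0.8}             # stable mechanism P(X=1 | Z=z)
p_y1_given_xz = {(1, 0): 0.3, (1, 1): 0.9}  # stable mechanism P(Y=1 | X=x, Z=z)

# P(Y=1 | X=1) moves when the unstable P(Z) moves across environments ...
print(p_y1_given_x1(0.1, p_x1_given_z, p_y1_given_xz))
print(p_y1_given_x1(0.9, p_x1_given_z, p_y1_given_xz))
# ... whereas an estimate using only the stable factor P(Y | X, Z) stays fixed.
print(p_y1_given_xz[(1, 0)], p_y1_given_xz[(1, 1)])
```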

Jointly Optimizing Diversity and Relevance in Neural Response Generation (1902.11205v1)

Xiang Gao, Sungjin Lee, Yizhe Zhang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan

2019-02-28

Although recent neural conversation models have shown great potential, they often generate bland and generic responses. While various approaches have been explored to diversify the output of the conversation model, the improvement often comes at the cost of decreased relevance. In this paper, we propose a method to jointly optimize diversity and relevance that essentially fuses the latent space of a sequence-to-sequence model and that of an autoencoder model by leveraging novel regularization terms. As a result, our approach induces a latent space in which the distance and direction from the predicted response vector roughly match the relevance and diversity, respectively. This property also lends itself well to an intuitive visualization of the latent space. Both automatic and human evaluation results demonstrate that the proposed approach brings significant improvement compared to strong baselines in both diversity and relevance.
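
As a rough illustration of the latent-space geometry described above, here is a small sketch in which distance from a predicted response vector is varied explicitly; the dimensionality, the sampling rule, and the absence of a real decoder are simplifying assumptions, not the authors' model.

```python
# Illustrative sketch only: in a fused latent space of this kind, the predicted
# response vector is a point, distance from it trades off relevance, and
# direction selects content. A real decoder would map each latent back to text.

import numpy as np

rng = np.random.default_rng(0)
z_pred = rng.normal(size=64)            # latent vector predicted for the context

def sample_response_latent(z_pred, radius, rng):
    """Move `radius` away from the prediction along a random unit direction."""
    direction = rng.normal(size=z_pred.shape)
    direction /= np.linalg.norm(direction)
    return z_pred + radius * direction

for radius in (0.0, 0.5, 2.0):          # larger radius -> more diverse, less relevant
    z = sample_response_latent(z_pred, radius, rng)
    print(radius, np.linalg.norm(z - z_pred))
```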

A hybrid machine-learning algorithm for designing quantum experiments (1812.03183v2)

L. O'Driscoll, R. Nichols, P. A. Knott

2018-12-07

We introduce a hybrid machine-learning algorithm for designing quantum optics experiments that produce specific quantum states. Our algorithm successfully found experimental schemes to produce all five states we asked it to, including Schrödinger cat states and cubic phase states, each to a high fidelity. Here we specifically focus on designing realistic experiments, and hence all of the algorithm's designs only contain experimental elements that are available with current technology. The core of our algorithm is a genetic algorithm that searches for optimal arrangements of the experimental elements, but to speed up the initial search we incorporate a neural network that classifies quantum states. The latter is of independent interest, as it quickly learned to accurately classify quantum states given their photon-number distributions.
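
The hybrid search can be sketched generically as a genetic algorithm with a cheap learned filter in the loop; the toy below uses a made-up fitness function and a trivial filter in place of state fidelity and the neural-network classifier, so it only mirrors the structure of the method, not the physics.

```python
# Minimal, generic sketch of the hybrid idea (toy fitness, not quantum optics):
# a genetic algorithm searches over sequences of "experimental elements", and a
# cheap classifier/filter prunes poor candidates before the expensive evaluation.

import random

ELEMENTS = ["beamsplitter", "squeezer", "displacement", "phase", "measure"]
TARGET = ["squeezer", "beamsplitter", "measure"]   # stand-in for a target state

def fitness(setup):
    """Toy stand-in for state fidelity: overlap with a target arrangement."""
    return sum(a == b for a, b in zip(setup, TARGET)) / len(TARGET)

def cheap_filter(setup):
    """Stand-in for the neural-network classifier: rejects obviously bad setups."""
    return "measure" in setup

def mutate(setup):
    i = random.randrange(len(setup))
    return setup[:i] + [random.choice(ELEMENTS)] + setup[i + 1:]

population = [[random.choice(ELEMENTS) for _ in range(3)] for _ in range(30)]
for generation in range(20):
    population = [s for s in population if cheap_filter(s)] or population
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    population = parents + [mutate(random.choice(parents)) for _ in range(20)]

best = max(population, key=fitness)
print(best, fitness(best))
```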

AFS: An Attention-based mechanism for Supervised Feature Selection (1902.11074v1)

Ning Gui, Danni Ge, Ziyin Hu

2019-02-28

As an effective data preprocessing step, feature selection has shown its effectiveness in preparing high-dimensional data for many machine learning tasks. The proliferation of high-dimensional, huge-volume big data, however, has brought major challenges, e.g. computational complexity and stability on noisy data, to existing feature-selection techniques. This paper introduces a novel neural network-based feature selection architecture, dubbed Attention-based Feature Selection (AFS). AFS consists of two detachable modules: an attention module for feature weight generation and a learning module for problem modeling. The attention module formulates the correlation problem between features and the supervision target as a binary classification problem, supported by a shallow attention net for each feature. Feature weights are generated based on the distribution of the respective feature selection patterns, adjusted by backpropagation during the training process. The detachable structure allows existing off-the-shelf models to be reused directly, which greatly reduces training time, training-data demands, and expertise requirements. A hybrid initialization method is also introduced to boost the selection accuracy for datasets without enough samples for feature weight generation. Experimental results show that AFS achieves the best accuracy and stability in comparison with several state-of-the-art feature selection algorithms on MNIST, noisy MNIST, and several datasets with small samples.
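
A hedged sketch of the general architecture, assuming PyTorch: a shallow attention net scores each feature, the resulting weights gate the input, and a separate, detachable learning module consumes the gated features. The module names and shapes are illustrative, not the authors' code.

```python
# Sketch of the general idea: an attention module scores each input feature,
# the resulting weights gate the features, and a detachable learning module
# (any off-the-shelf model) consumes the gated input.

import torch
import torch.nn as nn

class AttentionFeatureSelector(nn.Module):
    def __init__(self, n_features, hidden=16):
        super().__init__()
        # One shallow attention net applied per feature (weights shared here for brevity;
        # the paper uses a separate shallow net per feature).
        self.attn = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):                                  # x: (batch, n_features)
        scores = self.attn(x.unsqueeze(-1)).squeeze(-1)    # per-feature logits
        weights = torch.sigmoid(scores)                    # feature selection weights
        return x * weights, weights

class LearningModule(nn.Module):
    """Stand-in for the detachable problem-modeling module."""
    def __init__(self, n_features, n_classes):
        super().__init__()
        self.net = nn.Linear(n_features, n_classes)

    def forward(self, x):
        return self.net(x)

selector = AttentionFeatureSelector(n_features=20)
model = LearningModule(n_features=20, n_classes=2)
x = torch.randn(8, 20)
gated, weights = selector(x)
logits = model(gated)
print(weights.mean(dim=0))    # average selection weight per feature
```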

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings (1902.09980v3)

Tom Everitt, Pedro A. Ortega, Elizabeth Barnes, Shane Legg

2019-02-26

Agents are systems that optimize an objective function in an environment. Together, the goal and the environment induce secondary objectives: incentives. By modeling the agent-environment interaction in graphical models called influence diagrams, we can answer two fundamental questions about an agent's incentives directly from the graph: (1) which nodes is the agent incentivized to observe, and (2) which nodes is the agent incentivized to influence? The answers tell us which information and influence points need extra protection. For example, we may want a classifier for job applications not to use the ethnicity of the candidate, and a reinforcement learning agent not to take direct control of its reward mechanism. Different algorithms and training paradigms can lead to different influence diagrams, so our method can be used to identify algorithms with problematic incentives and to help design algorithms with better incentives.
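
As a rough proxy for the kind of graphical test described above (the paper gives exact criteria), the sketch below builds a toy influence diagram with networkx and checks a necessary condition for an influence incentive: the existence of a directed path from a node to the utility node. The example graph and node names are assumptions.

```python
# Hedged proxy only: the paper's criteria are more refined; here we just test a
# necessary condition, namely that a node can only carry an incentive to be
# influenced if it has a directed path to the utility node.

import networkx as nx

# Toy influence diagram: D = decision node, U = utility node, others are chance nodes.
diagram = nx.DiGraph([
    ("ethnicity", "application_score"),
    ("qualifications", "application_score"),
    ("application_score", "D"),      # the decision observes the score
    ("D", "U"),                      # the hiring decision affects utility
    ("qualifications", "U"),         # job performance affects utility
])

for node in ("ethnicity", "qualifications", "application_score"):
    reaches_utility = nx.has_path(diagram, node, "U")
    print(f"{node}: directed path to utility = {reaches_utility}")
```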

Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-based Approach (1902.01073v2)

Santtu Tikka, Antti Hyttinen, Juha Karvanen

2019-02-04

Causal effect identification considers whether an interventional probability distribution can be uniquely determined, without parametric assumptions, from measured source distributions and structural knowledge of the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm that operates directly over the rules of do-calculus. Due to the generality of do-calculus, the search can take more advanced data-generating mechanisms into account, along with arbitrary types of observational and experimental source distributions. The search is enhanced by a heuristic and by search-space reduction techniques. The approach, called do-search, is provably sound, and it is complete with respect to identifiability problems that have been shown to be completely characterized by do-calculus. When extended with additional rules, the search can handle missing-data problems as well. With this versatile search, we are able to approach new problems such as combined transportability and selection bias, or multiple sources of selection bias. We also perform a systematic analysis of bivariate missing-data problems and study causal inference under a case-control design.
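
The overall shape of such a search can be sketched as breadth-first search over symbolic expressions under a set of rewrite rules; the toy below uses string expressions and two hand-written rules as stand-ins for the rules of do-calculus, so it shows only the search skeleton, not the actual identification machinery.

```python
# Generic, hedged skeleton of a rule-based search: start from the available input
# distributions and repeatedly apply rewrite rules (stand-ins for do-calculus and
# probability rules) until the target expression is derived.

from collections import deque

def do_search(inputs, target, rules):
    """Breadth-first search over derivable expressions; returns a derivation or None."""
    frontier = deque([(expr, [expr]) for expr in inputs])
    seen = set(inputs)
    while frontier:
        expr, path = frontier.popleft()
        if expr == target:
            return path
        for rule in rules:
            for new in rule(expr):
                if new not in seen:
                    seen.add(new)
                    frontier.append((new, path + [new]))
    return None

# Toy rules on strings standing in for distribution expressions.
rules = [
    lambda e: ["P(y|do(x))"] if e == "P(y|x,z)P(z)" else [],
    lambda e: ["P(y|x,z)P(z)"] if e == "P(y,z|x)" else [],
]
print(do_search(["P(y,z|x)"], "P(y|do(x))", rules))
```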

Real-time tree search with pessimistic scenarios (1902.10870v1)

Takayuki Osogami, Toshihiro Takahashi

2019-02-28

Autonomous agents need to make decisions sequentially, under a partially observable environment, and in consideration of how other agents behave. In critical situations, such decisions must be made in real time, for example to avoid collisions and recover to safe conditions. We propose a tree-search technique in which a deterministic and pessimistic scenario is used after a specified depth. Because there is no branching under the deterministic scenario, the proposed technique lets the agent look far ahead into the future in real time. The effectiveness of the proposed technique is demonstrated in Pommerman, a multi-agent environment used in a NeurIPS 2018 competition, where agents implementing the proposed technique won first and third place.
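
A minimal sketch of the idea, with toy dynamics standing in for Pommerman: the search branches over actions only up to a fixed depth and then evaluates each leaf with a single deterministic, pessimistic rollout, so the lookahead can extend much further without exponential blow-up. The action set and transition function below are made up for illustration.

```python
# Hedged sketch of the technique with toy game dynamics (not the authors' agent).

def pessimistic_rollout(state, step, horizon):
    """Deterministic worst-case continuation: always take the worst outcome for us."""
    value = state
    for _ in range(horizon):
        value = min(step(value, a) for a in ACTIONS)
    return value

def search(state, step, branch_depth, horizon):
    """Branch over actions up to `branch_depth`, then switch to the pessimistic scenario."""
    if branch_depth == 0:
        return pessimistic_rollout(state, step, horizon)
    return max(search(step(state, a), step, branch_depth - 1, horizon) for a in ACTIONS)

# Toy dynamics: the "state" is a score; actions nudge it up or down.
ACTIONS = (-1, 0, 1)
step = lambda s, a: s + a
print(search(0, step, branch_depth=3, horizon=10))
```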

Markov Game Modeling of Moving Target Defense for Strategic Detection of Threats in Cloud Networks (1812.09660v2)

Ankur Chowdhary, Sailik Sengupta, Dijiang Huang, Subbarao Kambhampati

2018-12-23

The processing and storage of critical data in large-scale cloud networks necessitate scalable security solutions. It has been shown that deploying all possible security measures incurs a performance cost by using up valuable computing and networking resources, which are the primary selling points for cloud service providers. Thus, there has been recent interest in developing Moving Target Defense (MTD) mechanisms that help optimize the joint objective of maximizing security while ensuring that the impact on performance is minimized. Often, these techniques model the problem of multi-stage attacks by stealthy adversaries as a single-step attack detection game using graph connectivity measures as a heuristic to measure performance, thereby (1) losing valuable information that is inherently present in graph-theoretic models designed for large cloud networks, and (2) producing strategies that have asymmetric impacts on performance. In this work, we leverage knowledge in attack graphs of a cloud network to formulate a zero-sum Markov Game and use the Common Vulnerability Scoring System (CVSS) to derive meaningful utility values for this game. We then show that the optimal strategy for placing detection mechanisms against an adversary is equivalent to computing the mixed min-max equilibrium of the Markov Game. We compare the gains obtained by our method with other techniques presently used in cloud network security, thereby showing its effectiveness. Finally, we highlight how the method was applied to a small real-world cloud system.
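
For the equilibrium computation, one stage of a zero-sum game can be solved for the defender's mixed min-max strategy with a standard linear program, as sketched below using scipy; the payoff matrix is made up rather than CVSS-derived, and a full Markov game would repeat such a solve per state inside value iteration.

```python
# Hedged sketch: mixed min-max strategy for one zero-sum stage game via an LP.
# Payoff numbers are illustrative only, not derived from CVSS scores.

import numpy as np
from scipy.optimize import linprog

# Rows = defender's detector placements, columns = attacker actions.
# Entries are the defender's payoff (higher is better for the defender).
payoff = np.array([[ 3.0, -1.0,  0.0],
                   [-2.0,  4.0,  1.0],
                   [ 0.0,  1.0,  2.0]])

n_rows = payoff.shape[0]
# Maximize v subject to: for every attacker column j, sum_i x_i * payoff[i, j] >= v,
# with sum_i x_i = 1 and x_i >= 0. linprog minimizes, so we minimize -v.
c = np.concatenate([np.zeros(n_rows), [-1.0]])
A_ub = np.hstack([-payoff.T, np.ones((payoff.shape[1], 1))])
b_ub = np.zeros(payoff.shape[1])
A_eq = np.concatenate([np.ones(n_rows), [0.0]]).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * n_rows + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("defender mixed strategy:", res.x[:n_rows], "game value:", res.x[-1])
```

The same LP shape is the textbook reduction of a zero-sum matrix game to linear programming; only the payoff construction would change when utilities come from CVSS-based scoring.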

Deep learning generalizes because the parameter-function map is biased towards simple functions (1805.08522v4)

Guillermo Valle-Pérez, Chico Q. Camargo, Ard A. Louis

2018-05-22

Deep neural networks (DNNs) generalize remarkably well without explicit regularization even in the strongly over-parametrized regime where classical learning theory would instead predict that they would severely overfit. While many proposals for some kind of implicit regularization have been made to rationalise this success, there is no consensus for the fundamental reason why DNNs do not strongly overfit. In this paper, we provide a new explanation. By applying a very general probability-complexity bound recently derived from algorithmic information theory (AIT), we argue that the parameter-function map of many DNNs should be exponentially biased towards simple functions. We then provide clear evidence for this strong simplicity bias in a model DNN for Boolean functions, as well as in much larger fully connected and convolutional networks applied to CIFAR10 and MNIST. As the target functions in many real problems are expected to be highly structured, this intrinsic simplicity bias helps explain why deep networks generalize well on real world problems. This picture also facilitates a novel PAC-Bayes approach where the prior is taken over the DNN input-output function space, rather than the more conventional prior over parameter space. If we assume that the training algorithm samples parameters close to uniformly within the zero-error region then the PAC-Bayes theorem can be used to guarantee good expected generalization for target functions producing high-likelihood training sets. By exploiting recently discovered connections between DNNs and Gaussian processes to estimate the marginal likelihood, we produce relatively tight generalization PAC-Bayes error bounds which correlate well with the true error on realistic datasets such as MNIST and CIFAR10 and for architectures including convolutional and fully connected networks.
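
The simplicity-bias claim lends itself to a quick toy experiment: sample random parameters for a tiny network many times and look at the distribution over the Boolean functions it implements. The sketch below, with an arbitrary 3-input threshold network, is only in the spirit of the paper's much larger experiments; the architecture and sample count are assumptions.

```python
# Hedged toy experiment on the parameter-function map of a tiny network:
# sample random weights repeatedly, record the induced Boolean function on
# 3 inputs (its truth table), and inspect how skewed the distribution is.

import itertools
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
inputs = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)

def random_function(rng, hidden=8):
    """Boolean function computed by a small random 3-hidden-1 threshold network."""
    w1 = rng.normal(size=(3, hidden)); b1 = rng.normal(size=hidden)
    w2 = rng.normal(size=hidden);      b2 = rng.normal()
    h = np.maximum(inputs @ w1 + b1, 0.0)
    out = (h @ w2 + b2) > 0
    return tuple(out.astype(int))          # truth table over the 8 inputs

counts = Counter(random_function(rng) for _ in range(20000))
print("distinct functions found:", len(counts))
print("most frequent truth tables and counts:", counts.most_common(5))
```

In runs of this kind, a handful of very simple functions (e.g. the constant ones) typically dominate the counts, which is the qualitative pattern the abstract describes on a much larger scale.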


