Towards a Data-driven Approach to Exploring Galaxy Evolution via Generative Adversarial Networks
Tian Li
tian.li@pku.edu.cn
EECS, Peking University

Abstract
Since laboratory experiments for exploring astrophysical processes are impossible due to limited human life spans, astrophysicists generally employ two approaches: inference from observations and predictions from simulations. In the first case, observers take and combine data to constrain the underlying physical processes and infer evolutionary pathways from the data. In the second case, simulators combine proposed prescriptions for the relevant physical processes, implement them in a simulation or a semi-analytic model, and then test its predictions against the observations. Here we propose a new approach that uses generative models for data-driven exploration of physical processes in astrophysics. We use the problem of the quenching of star formation in galaxies to show how we can independently manipulate physical attributes by encoding objects, galaxies in this case, into the latent space of a neural network, and how walks in latent space can forward-model galaxy evolution. We show that changes in specific star formation rate (SSFR) and bulge-to-disk ratio (BTDR) largely, but not entirely, describe the galaxy quenching process (i.e., the galaxy evolution process).

Keywords: galaxy evolution, GAN, image processing

1. Introduction
The universe is isotropic, with many galaxies scattered throughout it. Astrophysicists have long relied on manually crafted physical models to explore the process of galaxy evolution. However, there may be subtle hidden information that those physical models cannot capture, even with sufficient domain priors.
Since deep learning has the potential to learn a data-driven prior that goes beyond model-driven limits, and generative models can generate thousands of high-quality samples conditioned on given attributes, we ask: can generative models learn the process of galaxy evolution? We develop a neural network model to answer this question. We first give a formal problem definition and then summarize the challenges together with our contributions.

Project in Summer 2017
Figure 1: Input galaxy
Figure 2: Output images indicating the evolution process of the galaxy

1.1. Problem definition
We model the state of galaxies as static RGB images (footnote 1). We now formally define the input and output of our model during inference.
Input: a galaxy image x with a label y, where y is the value of one of the galaxy's physical attributes.
Output: a series of images showing the state of the input galaxy conditioned on each target discrete attribute value, while preserving the identity of the galaxy.
Example. We present an example (input, output) pair. The input (Figure 1) is an observed galaxy image; the output (Figure 2) is a series of synthesized images conditioned on ten categories (class 0 to class 9, from left to right) of one physical property that is a proxy for the galaxy's age. Here, class 0 to class 9 are normalized values of this property in log space after discretization. Note that class k may be 1 billion years earlier than class k + 1. The galaxy in the white rectangle is the one conditioned on its real physical attributes (i.e., it shows the recovery of the original image).

1.2. Challenges
To develop a reasonable data-driven approach to this problem, we face three major challenges.
(Models) The galaxy images appear very similar, which makes it difficult to train a neural network to capture the subtle features of each galaxy category. To embed the images into the latent space, we first need a good auto-encoder that recovers the real images. We start with BEGAN [1], using its discriminator as the auto-encoder, and find that it recovers the real images very well. To manipulate the attributes of the images, we apply the idea of Fader Network [2] and force the z-vector produced by the encoder to be invariant to the label.
This is achieved by co-training the discriminator and the auto-encoder so that, at convergence, the discriminator cannot tell the real label from the z-vector. During inference, we can generate images conditioned on a label by feeding that label into the decoder. In addition, we add a novel regressor module to further improve the results.
(Implementation) The implementation, especially the training procedure, must also be considered carefully, because neural network behavior is hard to predict and generative adversarial networks are hard to train. We describe the implementation in detail in Section 3 and Section 4.1.
(Evaluation) There is no ground truth, so the evaluation must be carried out carefully. First, we ask astrophysicists to inspect our results and give a qualitative evaluation. Second, we conduct a quantitative evaluation to check whether the synthesized images follow the distribution of real galaxies.
Limitations. It is hard to evaluate exactly to what extent our results are reasonable. Also, the current version of our model can only synthesize images conditioned on one attribute. In addition, modeling galaxy states as 2-D images may be an oversimplification, so we need to consider how to address the problem in the 3-D domain.
Overview. The rest of the paper is organized as follows. We give the necessary background in Section 2. We describe our methods in Section 3, report our results in Section 4, and conclude the paper in Section 6.

Footnote 1: We pre-processed the astronomy images in FITS format retrieved from the Sloan Digital Sky Survey database into JPEGs.

2. Background
A fundamental question in astronomy is: what are the determining factors in the transformation from star-forming galaxies to quenching galaxies (i.e., what are the determining factors in galaxy evolution)? Astrophysicists have been trying to answer this question for decades, and two important physical properties change regularly with galaxy evolution: the SSFR (specific star formation rate) and the BTDR (bulge-to-disk ratio). In our work, we mainly focus on these two properties.
SSFR. The expected transformation of galaxies when the SSFR value gets higher is that the color of the galaxy turns from blue to red/orange. We normalize the values of SSFR into class labels [0, 1, ..., 9].
BTDR. The expected transformation of galaxies when the BTDR value gets higher is that the center of the galaxy becomes larger and brighter. We normalize the values of BTDR into class labels [0, 1, ..., 9].
Note that these two physical properties are not independent of each other, so when the SSFR value gets higher, we also expect the galaxies to become brighter, and vice versa.
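The normalization of an attribute into class labels 0 to 9 described above can be illustrated with a short sketch. The paper does not specify the binning scheme, so equal-width bins in log space are an assumption here, and the function name `discretize_log_attribute` and the sample values are hypothetical.

```python
import numpy as np

def discretize_log_attribute(values, n_classes=10):
    """Map positive attribute values to integer classes 0..n_classes-1.

    Assumption: equal-width bins over the log10 range of the data.
    """
    log_v = np.log10(np.asarray(values, dtype=float))
    lo, hi = log_v.min(), log_v.max()
    if hi == lo:
        # All values identical: put everything in class 0.
        return np.zeros(log_v.shape, dtype=int)
    scaled = (log_v - lo) / (hi - lo)              # normalized to [0, 1]
    classes = np.floor(scaled * n_classes).astype(int)
    return np.clip(classes, 0, n_classes - 1)      # the maximum falls in the last class

# Hypothetical attribute values spanning three orders of magnitude.
sample_values = np.array([1e-12, 3e-12, 1e-11, 1e-10, 1e-9])
labels = discretize_log_attribute(sample_values)
print(labels)
```

Other choices, such as quantile-based bins that balance the number of galaxies per class, would fit the description equally well.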
Figure 3: Fader Network Architecture

3. Methods
3.1. Model
The general idea is to use recent advances in generative models to first encode the galaxies into latent space and then decode the latent vector, along with target labels, to generate a series of images simulating what the galaxy looks like billions of years earlier or later. Our base model is Fader Network [2], with some minor modifications. As shown in Figure 3, Fader Network is composed of two parts: an auto-encoder and a discriminator.
The auto-encoder consists of an encoder and a decoder (both are neural networks). The encoder takes an image x as input and outputs E(x), which we call the z-vector. The decoder takes the z-vector and the label y as input (footnote 2) and outputs an image of the same size as the input. During training, the auto-encoder aims to recover the original images while producing an E(x) that fools the discriminator.
The discriminator takes the z-vector as input and tries to classify it into the correct category. As in GANs, this corresponds to a two-player game in which the discriminator aims to maximize its ability to identify attributes, while the encoder aims to prevent the discriminator from succeeding. With this discriminator, the encoder can learn an invariant latent representation using an adversarial formulation of the learning objective.
The loss function of the discriminator is:
$$L_{dis}(\theta_{dis} \mid \theta_{enc}) = -\frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \log P_{\theta_{dis}}(y \mid E_{\theta_{enc}}(x))$$
Together with the adversarial part (i.e., the discriminator), the loss function of the auto-encoder is:
$$L(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \left[ \| D_{\theta_{dec}}(E_{\theta_{enc}}(x), y) - x \|_2^2 - \lambda_E \log P_{\theta_{dis}}(1 - y \mid E_{\theta_{enc}}(x)) \right]$$

Footnote 2: When training the auto-encoder, the label is the real label; during inference, the target labels are fed in instead.
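The two losses in Section 3.1 can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the discriminator is represented only by its output logits, and for ten classes the term P(1 - y | E(x)) is interpreted as the probability of any wrong label, 1 - P(y | E(x)), which is an assumption that reduces to the original expression in the binary case.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def discriminator_loss(dis_logits, y):
    """Negative mean log-probability of the true label given the z-vector."""
    p_true = softmax(dis_logits)[np.arange(len(y)), y]
    return -np.mean(np.log(p_true + 1e-12))

def autoencoder_loss(x, x_rec, dis_logits, y, lambda_e):
    """Reconstruction error minus lambda_E * log P(wrong label | z).

    Assumption: P(1 - y | z) is taken as 1 - P(y | z) for the multi-class case.
    """
    per_image = np.sum((x_rec - x) ** 2, axis=tuple(range(1, x.ndim)))
    rec = np.mean(per_image)
    p_true = softmax(dis_logits)[np.arange(len(y)), y]
    adv = -np.mean(np.log(1.0 - p_true + 1e-12))
    return rec + lambda_e * adv

# Toy batch: perfect reconstruction and a discriminator that outputs uniform logits.
x = np.zeros((2, 4))
y = np.array([3, 7])
uniform_logits = np.zeros((2, 10))
d_loss = discriminator_loss(uniform_logits, y)
ae_loss = autoencoder_loss(x, x.copy(), uniform_logits, y, lambda_e=2.0)
```

With uniform discriminator outputs the encoder has already hidden the label, so the adversarial term is small; it grows as the discriminator becomes more confident about the true class.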
Figure 4: The final model we used (Fader Network plus a simple but fundamental regressor)

Fader Network uses the discriminator for adversarial training to force the z-vector to be independent of the label. During inference, we feed the auto-encoder an image along with a target label, and its output is an image conditioned on that label that preserves the identity of the original image.

3.2. Calibration
To further improve the results, we add a calibration module, a regressor (footnote 3), and modify the loss function of the auto-encoder accordingly. As shown in Figure 4, the regressor R takes an image (real or synthesized) as input and predicts a value for the label of some attribute (e.g., the SSFR or BTDR value of that image). We first train R on real images x and the corresponding labels y, then use the trained regressor to predict labels for synthesized images, R_{θ_reg}(D_{θ_dec}(E_{θ_enc}(x), y)). After that, we compute the MSE loss (footnote 4) between the targets and the predicted labels (y and R_{θ_reg}(D_{θ_dec}(E_{θ_enc}(x), y)), respectively). We minimize this loss so that, during backpropagation, the auto-encoder is optimized towards a smaller MSE loss (i.e., the synthesized images better match their target labels and are, hopefully, more realistic).
The new loss function for the auto-encoder is:
$$L(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \left[ \| D_{\theta_{dec}}(E_{\theta_{enc}}(x), y) - x \|_2^2 - \lambda_E \log P_{\theta_{dis}}(1 - y \mid E_{\theta_{enc}}(x)) + \lambda_R \sum_{y' \in \{0,\dots,9\}} \left( R_{\theta_{reg}}(D_{\theta_{dec}}(E_{\theta_{enc}}(x), y')) - y' \right)^2 \right]$$
where y' ranges over the ten target classes.

3.3. Implementation
We first re-implement Fader Network [2] and then experiment with some minor modifications to the architecture, which we summarize here. We experiment with several auto-encoder models.
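As a side note on the calibration term from Section 3.2, the regression penalty can be sketched as below. The helper `calibration_penalty` and the toy `decode` and `regress` callables are hypothetical stand-ins for the trained decoder and regressor; the sketch only shows how the squared errors are accumulated over the ten target classes before being scaled by λ_R.

```python
import numpy as np

def calibration_penalty(regress, decode, z, n_classes=10):
    """Sum over target classes of the batch-mean squared error between
    the regressor's prediction on the decoded image and the target class."""
    total = 0.0
    for target in range(n_classes):
        labels = np.full(len(z), target)       # same target label for the whole batch
        pred = regress(decode(z, labels))      # predicted label of the synthesized image
        total += np.mean((pred - target) ** 2)
    return total

# Toy stand-ins (assumptions, not real networks): the "image" is the latent
# code shifted by the label, and the "regressor" averages its pixels.
toy_decode = lambda z, labels: z + labels[:, None]
toy_regress = lambda img: img.mean(axis=1)

penalty_good = calibration_penalty(toy_regress, toy_decode, np.zeros((4, 3)))
penalty_bad = calibration_penalty(toy_regress, toy_decode, np.ones((4, 3)))
```

In the full objective this quantity would be multiplied by λ_R and added to the reconstruction and adversarial terms, with gradients flowing through the decoder and encoder while the regressor stays fixed.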
We find that the auto-encoder from BEGAN [1] (called the discriminator in the BEGAN paper) works well for our galaxy dataset, and we use it in our final model. For the discriminator, we make slight changes to the original Fader Network architecture by experimenting with different dimensions of the last hidden layer. We use a regression version of ResNet-50 as the regressor.

Footnote 3: Although predicting the label of an image looks like a classification problem, regression makes sense here because the classes are ordered: in a regression problem, predicting 1 for a true label of 0 incurs a smaller error than predicting 2, which is reasonable, whereas in classification both errors count the same.
Footnote 4: In our experiments, we find that the choice of loss function here does not affect the results much.

4. Experiments
4.1. Set up and training
Data. We randomly subsample 5000 images from 10 classes to form the training set and 1000 other images as the testing set.
Training procedure. To make the whole system update smoothly, we use the following training procedure:
Step 1: train only the auto-encoder, to get a network that recovers the original images very well.
Step 2: train only the regressor, to an MSE loss of around 1.7.
Step 3: train only the discriminator, to a classification accuracy of around 92%.
Step 4: train the auto-encoder and the discriminator together while slowly increasing λ_E and λ_R.

4.2. Results
We present our visual results in Figures 5 and 6. Each row is a separate, independent galaxy and shows the evolution process of that galaxy; each column represents one class (from 0 to 9). The galaxies in the rectangles represent the recovery of the original images (i.e., the target label fed into the decoder is the same as the real label).

4.3. Quantitative evaluation
We also conduct a quantitative evaluation of our results; see Figures 7 and 8 for details. Here, we only consider the case where galaxies evolve with changes in the BTDR value. For both scenarios, before and after calibration, we train a regressor on real images and apply it to evaluate the synthesized images. In each figure, the x-axis represents the predictions of the regressor and the y-axis the target labels. If the network is trained perfectly, we expect the (x, y) points to form a diagonal line. Figures 7b, 7c, 8b, and 8c show regression results on real images in the training and testing sets of the regressor, before and after calibration, respectively.
These results indicate that our regressor is accurate and fair. The distribution of predictions on generated images (Figures 7a and 8a) shows that our model captures the properties of galaxy quenching to a great extent. The improvement from Figure 7a to Figure 8a shows that the calibration module helps significantly here.

5. Related Work
Object attribute manipulation. Several deep-learning approaches address the attribute manipulation problem. For example, [3] models the manipulation operation as learning residual images, and [2] directly outputs a different image by encoding the attribute labels as input to the decoder network.
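The label-conditioning idea from [2], which the decoder in Section 3.1 also relies on, can be sketched as follows. This is a simplified illustration under stated assumptions: the label is one-hot encoded and concatenated to the z-vector once at the decoder input, and the helper names are hypothetical. The actual architecture may inject the label differently.

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """One-hot encode an integer label array of shape [batch]."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def decoder_inputs_for_all_classes(z, n_classes=10):
    """For a single z-vector of shape [d], build one decoder input per
    target class by concatenating z with the one-hot label."""
    z_rep = np.tile(z, (n_classes, 1))                   # [n_classes, d]
    labels = one_hot(np.arange(n_classes), n_classes)    # [n_classes, n_classes]
    return np.concatenate([z_rep, labels], axis=1)       # [n_classes, d + n_classes]

# Hypothetical 3-dimensional z-vector for one galaxy.
z = np.array([0.5, -1.0, 2.0])
batch = decoder_inputs_for_all_classes(z)
print(batch.shape)
```

Feeding these ten rows through a decoder would produce one synthesized image per class for the same galaxy, which corresponds to one row of the visual results.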
Figure 5: Visual results of galaxies evolving with the changing SSFR value
Figure 6: Visual results of galaxies evolving with the changing BTDR value
Figure 7: Before calibration. (a) Regression on generated images. (b) Regression on real images in the training set of the regressor. (c) Regression on real images in the testing set of the regressor.
Figure 8: After calibration. (a) Regression on generated images. (b) Regression on real images in the training set of the regressor. (c) Regression on real images in the testing set of the regressor.

Invariant representation of images. There is much work on learning invariant representations using adversarial training, including [4]. We follow [2] and obtain a label-independent z-vector by training the encoder to fool a discriminator, so that the discriminator cannot classify the z-vector correctly and the z-vector ideally contains no information about the real attributes.
Human face aging. Recently, the face aging problem (i.e., predicting the future appearance of a face) has been studied intensively, with both physical-model-based approaches [5] and deep-learning-based approaches [6]. There is also work addressing both face aging and face regression (i.e., estimating previous appearance), including [7]. In our work, we focus on the deep-learning-based approach, which is an interesting counterpart to the physical models that astrophysicists have built over decades.
GAN and VAE. In this work we apply the Fader Network [2] model, a variant of generative adversarial networks, to model the data distribution of galaxies with different physical properties. VAEs and GANs belong to different families of generative models: a VAE models an explicit density function, while a GAN models the density implicitly. We believe that a VAE could achieve similar results, and we leave the comparison between GAN and VAE on this task as future work.
6. Conclusion and future work
Our work provides a novel data-driven approach to determining the dominant physical properties that affect the galaxy quenching (i.e., evolution) process. We carefully train and evaluate our models both qualitatively and quantitatively. The comparison between our method and state-of-the-art physical-model-based methods on this task is left for future work. Manipulating galaxies in the 3-D domain is also an interesting direction to explore further. We hope that our work can provide physicists with novel insights into how they conduct their research, given the oceans of knowledge hidden in the data itself.

References
[1] D. Berthelot, T. Schumm, L. Metz, BEGAN: Boundary equilibrium generative adversarial networks, arXiv preprint.
[2] G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, M. Ranzato, Fader networks: Manipulating images by sliding attributes, arXiv preprint.
[3] W. Shen, R. Liu, Learning residual images for face attribute manipulation, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2017.
[4] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, V. Lempitsky, Domain-adversarial training of neural networks, Journal of Machine Learning Research 17 (59) (2016).
[5] J. Suo, X. Chen, S. Shan, W. Gao, Q. Dai, A concatenational graph evolution aging model, IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (11) (2012).
[6] W. Wang, Z. Cui, Y. Yan, J. Feng, S. Yan, X. Shu, N. Sebe, Recurrent face aging, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[7] Z. Zhang, Y. Song, H. Qi, Age progression/regression by conditional adversarial autoencoder, arXiv preprint.

Appendix A. More results
Figure A.9 shows synthesized images generated by first changing the SSFR value and then changing the BTDR value. The galaxies from top left to bottom right show a reasonable evolution process.
Figure A.9: Synthesized images generated by first changing the SSFR value, then changing the BTDR value
More informationGENERATIVE ADVERSARIAL LEARNING
GENERATIVE ADVERSARIAL LEARNING OF MARKOV CHAINS Jiaming Song, Shengjia Zhao & Stefano Ermon Computer Science Department Stanford University {tsong,zhaosj12,ermon}@cs.stanford.edu ABSTRACT We investigate
More informationDeep Convolutional Neural Networks for Pairwise Causality
Deep Convolutional Neural Networks for Pairwise Causality Karamjit Singh, Garima Gupta, Lovekesh Vig, Gautam Shroff, and Puneet Agarwal TCS Research, Delhi Tata Consultancy Services Ltd. {karamjit.singh,
More informationUnsupervised Learning
CS 3750 Advanced Machine Learning hkc6@pitt.edu Unsupervised Learning Data: Just data, no labels Goal: Learn some underlying hidden structure of the data P(, ) P( ) Principle Component Analysis (Dimensionality
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks Delivered by Mark Ebden With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable
More informationGenerative Adversarial Networks
Generative Adversarial Networks Stefano Ermon, Aditya Grover Stanford University Lecture 10 Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 10 1 / 17 Selected GANs https://github.com/hindupuravinash/the-gan-zoo
More informationarxiv: v1 [astro-ph.im] 5 May 2015
Autoencoding Time Series for Visualisation arxiv:155.936v1 [astro-ph.im] 5 May 215 Nikolaos Gianniotis 1, Dennis Kügler 1, Peter Tiňo 2, Kai Polsterer 1 and Ranjeev Misra 3 1- Astroinformatics - Heidelberg
More informationLearning Deep Architectures for AI. Part II - Vijay Chakilam
Learning Deep Architectures for AI - Yoshua Bengio Part II - Vijay Chakilam Limitations of Perceptron x1 W, b 0,1 1,1 y x2 weight plane output =1 output =0 There is no value for W and b such that the model
More informationSome Applications of Machine Learning to Astronomy. Eduardo Bezerra 20/fev/2018
Some Applications of Machine Learning to Astronomy Eduardo Bezerra ebezerra@cefet-rj.br 20/fev/2018 Overview 2 Introduction Definition Neural Nets Applications do Astronomy Ads: Machine Learning Course
More informationarxiv: v3 [cs.lg] 18 Mar 2013
Hierarchical Data Representation Model - Multi-layer NMF arxiv:1301.6316v3 [cs.lg] 18 Mar 2013 Hyun Ah Song Department of Electrical Engineering KAIST Daejeon, 305-701 hyunahsong@kaist.ac.kr Abstract Soo-Young
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationThe Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems
The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems Weinan E 1 and Bing Yu 2 arxiv:1710.00211v1 [cs.lg] 30 Sep 2017 1 The Beijing Institute of Big Data Research,
More informationReconnaissance d objetsd et vision artificielle
Reconnaissance d objetsd et vision artificielle http://www.di.ens.fr/willow/teaching/recvis09 Lecture 6 Face recognition Face detection Neural nets Attention! Troisième exercice de programmation du le
More informationVariational Autoencoders
Variational Autoencoders Recap: Story so far A classification MLP actually comprises two components A feature extraction network that converts the inputs into linearly separable features Or nearly linearly
More informationLecture 7: Con3nuous Latent Variable Models
CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 7: Con3nuous Latent Variable Models All lecture slides will be available as.pdf on the course website: http://www.cs.toronto.edu/~urtasun/courses/csc2515/
More informationSinging Voice Separation using Generative Adversarial Networks
Singing Voice Separation using Generative Adversarial Networks Hyeong-seok Choi, Kyogu Lee Music and Audio Research Group Graduate School of Convergence Science and Technology Seoul National University
More informationCSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning
CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.
More informationAn Overview of Edward: A Probabilistic Programming System. Dustin Tran Columbia University
An Overview of Edward: A Probabilistic Programming System Dustin Tran Columbia University Alp Kucukelbir Eugene Brevdo Andrew Gelman Adji Dieng Maja Rudolph David Blei Dawen Liang Matt Hoffman Kevin Murphy
More informationGenerative models for missing value completion
Generative models for missing value completion Kousuke Ariga Department of Computer Science and Engineering University of Washington Seattle, WA 98105 koar8470@cs.washington.edu Abstract Deep generative
More informationLecture 16 Deep Neural Generative Models
Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed
More informationIntroduction to Deep Neural Networks
Introduction to Deep Neural Networks Presenter: Chunyuan Li Pattern Classification and Recognition (ECE 681.01) Duke University April, 2016 Outline 1 Background and Preliminaries Why DNNs? Model: Logistic
More informationSummary of A Few Recent Papers about Discrete Generative models
Summary of A Few Recent Papers about Discrete Generative models Presenter: Ji Gao Department of Computer Science, University of Virginia https://qdata.github.io/deep2read/ Outline SeqGAN BGAN: Boundary
More informationDeep Learning Year in Review 2016: Computer Vision Perspective
Deep Learning Year in Review 2016: Computer Vision Perspective Alex Kalinin, PhD Candidate Bioinformatics @ UMich alxndrkalinin@gmail.com @alxndrkalinin Architectures Summary of CNN architecture development
More informationNEAL: A Neurally Enhanced Approach to Linking Citation and Reference
NEAL: A Neurally Enhanced Approach to Linking Citation and Reference Tadashi Nomoto 1 National Institute of Japanese Literature 2 The Graduate University of Advanced Studies (SOKENDAI) nomoto@acm.org Abstract.
More informationGenerative Adversarial Networks, and Applications
Generative Adversarial Networks, and Applications Ali Mirzaei Nimish Srivastava Kwonjoon Lee Songting Xu CSE 252C 4/12/17 2/44 Outline: Generative Models vs Discriminative Models (Background) Generative
More informationarxiv: v1 [cs.lg] 11 Jan 2019
Variation Network: Learning High-level Attributes for Controlled Input Manipulation arxiv:1901.03634v1 [cs.lg] 11 Jan 2019 Gaëtan Hadjeres Sony Computer Science Laboratories Gaetan.Hadjeres@sony.com Abstract
More informationConvolutional Neural Networks
Convolutional Neural Networks Books» http://www.deeplearningbook.org/ Books http://neuralnetworksanddeeplearning.com/.org/ reviews» http://www.deeplearningbook.org/contents/linear_algebra.html» http://www.deeplearningbook.org/contents/prob.html»
More informationLong-Short Term Memory and Other Gated RNNs
Long-Short Term Memory and Other Gated RNNs Sargur Srihari srihari@buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Sequence Modeling
More informationClassifying Galaxy Morphology using Machine Learning
Julian Kates-Harbeck, Introduction: Classifying Galaxy Morphology using Machine Learning The goal of this project is to classify galaxy morphologies. Generally, galaxy morphologies fall into one of two
More informationLearning Particle Physics by Example:
Learning Particle Physics by Example: Accelerating Science with Generative Adversarial Networks arxiv:1701.05927, arxiv:1705.02355 @lukede0 @lukedeo lukedeo@manifold.ai https://ldo.io Luke de Oliveira
More informationGANs. Machine Learning: Jordan Boyd-Graber University of Maryland SLIDES ADAPTED FROM GRAHAM NEUBIG
GANs Machine Learning: Jordan Boyd-Graber University of Maryland SLIDES ADAPTED FROM GRAHAM NEUBIG Machine Learning: Jordan Boyd-Graber UMD GANs 1 / 7 Problems with Generation Generative Models Ain t Perfect
More informationCS534 Machine Learning - Spring Final Exam
CS534 Machine Learning - Spring 2013 Final Exam Name: You have 110 minutes. There are 6 questions (8 pages including cover page). If you get stuck on one question, move on to others and come back to the
More informationAn Introduction to Bioinformatics Algorithms Hidden Markov Models
Hidden Markov Models Outline 1. CG-Islands 2. The Fair Bet Casino 3. Hidden Markov Model 4. Decoding Algorithm 5. Forward-Backward Algorithm 6. Profile HMMs 7. HMM Parameter Estimation 8. Viterbi Training
More informationMultiple Wavelet Coefficients Fusion in Deep Residual Networks for Fault Diagnosis
Multiple Wavelet Coefficients Fusion in Deep Residual Networks for Fault Diagnosis Minghang Zhao, Myeongsu Kang, Baoping Tang, Michael Pecht 1 Backgrounds Accurate fault diagnosis is important to ensure
More informationMachine Learning (CS 567) Lecture 2
Machine Learning (CS 567) Lecture 2 Time: T-Th 5:00pm - 6:20pm Location: GFS118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationEncoder Based Lifelong Learning - Supplementary materials
Encoder Based Lifelong Learning - Supplementary materials Amal Rannen Rahaf Aljundi Mathew B. Blaschko Tinne Tuytelaars KU Leuven KU Leuven, ESAT-PSI, IMEC, Belgium firstname.lastname@esat.kuleuven.be
More informationCSC 411 Lecture 12: Principal Component Analysis
CSC 411 Lecture 12: Principal Component Analysis Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 12-PCA 1 / 23 Overview Today we ll cover the first unsupervised
More informationCOMPARING FIXED AND ADAPTIVE COMPUTATION TIME FOR RE-
Workshop track - ICLR COMPARING FIXED AND ADAPTIVE COMPUTATION TIME FOR RE- CURRENT NEURAL NETWORKS Daniel Fojo, Víctor Campos, Xavier Giró-i-Nieto Universitat Politècnica de Catalunya, Barcelona Supercomputing
More informationDeep Learning Architecture for Univariate Time Series Forecasting
CS229,Technical Report, 2014 Deep Learning Architecture for Univariate Time Series Forecasting Dmitry Vengertsev 1 Abstract This paper studies the problem of applying machine learning with deep architecture
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationUNSUPERVISED LEARNING
UNSUPERVISED LEARNING Topics Layer-wise (unsupervised) pre-training Restricted Boltzmann Machines Auto-encoders LAYER-WISE (UNSUPERVISED) PRE-TRAINING Breakthrough in 2006 Layer-wise (unsupervised) pre-training
More informationNeural networks and optimization
Neural networks and optimization Nicolas Le Roux Criteo 18/05/15 Nicolas Le Roux (Criteo) Neural networks and optimization 18/05/15 1 / 85 1 Introduction 2 Deep networks 3 Optimization 4 Convolutional
More informationDeep learning on 3D geometries. Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering
Deep learning on 3D geometries Hope Yao Design Informatics Lab Department of Mechanical and Aerospace Engineering Overview Background Methods Numerical Result Future improvements Conclusion Background
More information