Towards a Data-driven Approach to Exploring Galaxy Evolution via Generative Adversarial Networks

Tian Li
tian.li@pku.edu.cn
EECS, Peking University

Abstract

Since laboratory experiments on astrophysical processes are impossible given the limited human life span, astrophysicists generally employ two approaches: inference from observations and predictions from simulations. In the first case, observers take and combine data to constrain the underlying physical processes and infer evolutionary pathways from the data. In the second case, simulators combine proposed prescriptions for the relevant physical processes, implement them in a simulation or a semi-analytic model, and then test its predictions against the observations. Here we propose a new approach that uses generative models for data-driven exploration of physical processes in astrophysics. We use the problem of the quenching of star formation in galaxies to show how we can independently manipulate physical attributes by encoding objects, galaxies in this case, into the latent space of a neural network, and use walks in latent space to forward-model galaxy evolution. We show that changes in the specific star formation rate (SSFR) and the bulge-to-disk ratio (BTDR) largely, but not entirely, describe the galaxy quenching process (i.e., the galaxy evolution process).

Keywords: galaxy evolution, GAN, image processing

1. Introduction

The universe contains a vast number of galaxies scattered throughout it. Astrophysicists have long relied on manually crafted physical models to explore the process of galaxy evolution. However, there may be subtle information that those physical models cannot capture, even with substantial prior domain knowledge.
Since deep learning has the potential to learn a data-driven prior that goes beyond model-driven limits, and generative models provide a way to generate thousands of high-quality samples conditioned on given attributes, we ask: can we adopt generative models to learn the process of galaxy evolution? We develop a neural network model to answer this question. We first give a formal problem definition and then summarize the challenges together with our contributions.

(Project in Summer 2017)

Figure 1: Input galaxy
Figure 2: Output images indicating the evolution process of the galaxy

1.1. Problem definition

We model the state of galaxies as static RGB images (footnote 1). We now formally define the input and output of our model during inference.

Input: a galaxy image x with a label y, where y is the value of one of the galaxy's physical attributes.
Output: a series of images showing the state of the input galaxy conditioned on each target discrete attribute value, while preserving the identity of the galaxy.

Footnote 1. We pre-processed the astronomy images, retrieved in FITS format from the Sloan Digital Sky Survey database, into JPEGs.

Example. We present an example (input, output) pair. The input (Figure 1) is an observed galaxy image; the output (Figure 2) is a series of synthesized images conditioned on ten categories (class 0 to class 9, from left to right) of one physical property that serves as a proxy for the galaxy's age. Here, class 0 to class 9 are the normalized values of this property in log space after discretization. Note that class k may be 1 billion years earlier than class k + 1. The galaxy in the white rectangle is the one conditioned on its real physical attribute (i.e., it shows the reconstruction of the original image).

Challenges

To develop a reasonable data-driven approach to this problem, we face three major challenges.

(Models) Galaxy images look very similar to one another, which makes it difficult to train a neural network to capture the subtle features of each galaxy category. To embed the images into the latent space, we first need a good auto-encoder that can recover the real images. We start with BEGAN [1], using its discriminator as the auto-encoder, and find that it recovers the real images very well. To manipulate the attributes of the images, we apply the idea of Fader Network [2] and force the z-vector produced by the encoder to be invariant to the label. This is achieved by co-training the discriminator and the auto-encoder such that, at convergence, the discriminator cannot infer the real label from the z-vector. During inference, we generate images conditioned on a label by feeding the label into the decoder. In addition, we add a novel regressor module to further improve the results.

(Implementation) The implementation, especially the training procedure, must be considered carefully, since neural network behavior is hard to predict and generative adversarial networks are hard to train. We describe the implementation in detail in Section 3 and Section 4.1.

(Evaluation) There is no ground truth, so the evaluation must be carried out carefully. First, we ask astrophysicists to inspect our results and give a qualitative evaluation. Second, we conduct a quantitative evaluation to check whether the synthesized images follow the distribution of real galaxies.

Limitations. It is hard to evaluate exactly to what extent our results are reasonable. Also, the current version of our model only supports synthesizing images conditioned on one attribute. In addition, modeling galaxy states as 2-D images may be an oversimplification, so we need to consider how to address the problem in the 3-D domain.

Overview. The rest of the paper is organized as follows. We give the necessary background in Section 2, describe our methods in Section 3, report our results in Section 4, and conclude the paper in Section 6.

2. Background

A fundamental question in astronomy is: what are the determining factors in the transformation of star-forming galaxies into quenched galaxies (i.e., what are the determining factors in galaxy evolution)? Astrophysicists have tried to answer this question for decades, and two important physical properties change regularly with galaxy evolution: the SSFR (specific star formation rate) and the BTDR (bulge-to-disk ratio). In our work, we mainly focus on these two properties.

SSFR. As the SSFR value gets higher, the color of the galaxy is expected to turn from blue to red/orange. We normalize the SSFR values and discretize them into class labels {0, 1, ..., 9}.

BTDR. As the BTDR value gets higher, the center of the galaxy is expected to become larger and brighter. We normalize the BTDR values and discretize them into class labels {0, 1, ..., 9}.

Note that these two physical properties are not independent of each other, so when the SSFR value gets higher, we also expect the galaxies to become brighter, and vice versa.
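The label construction described above (normalized values in log space, discretized into ten classes) can be illustrated with a short sketch. The text does not specify the binning scheme, so the equal-width bins over the observed log range, the function name `discretize_log`, and the sample values below are all assumptions made for illustration.

```python
import math


def discretize_log(values, num_classes=10):
    """Map positive attribute values to integer labels 0..num_classes-1.

    Assumed scheme (not taken from the paper): take log10, min-max
    normalize to [0, 1], then split the unit interval into equal-width bins.
    """
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    span = hi - lo
    if span == 0:
        # All values identical: everything falls into the first class.
        return [0 for _ in logs]
    labels = []
    for lv in logs:
        normalized = (lv - lo) / span
        # The maximum maps to 1.0, so clamp it into the last bin.
        labels.append(min(int(normalized * num_classes), num_classes - 1))
    return labels


# Hypothetical attribute values spanning several orders of magnitude.
sample = [1e-12, 3e-12, 1e-11, 1e-10, 1e-9]
print(discretize_log(sample))
```

Quantile-based bins would be an equally plausible choice if balanced classes were desired; the sketch only shows the general shape of the step.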

Figure 3: Fader Network architecture

3. Methods

3.1. Model

The general idea is to use recent advances in generative models to first encode galaxies into a latent space, and then decode the latent vector together with target labels to generate a series of images simulating what the galaxy looks like billions of years earlier or later.

Our base model is Fader Network [2], with minor modifications. As shown in Figure 3, Fader Network is composed of two parts: an auto-encoder and a discriminator. The auto-encoder consists of an encoder and a decoder, both of which are neural networks. The encoder takes an image x as input and outputs E(x), which we call the z-vector. The decoder takes the z-vector and the label y as input (footnote 2) and outputs an image of the same size as the input. The auto-encoder aims to recover the original images during training while producing an E(x) that fools the discriminator. The discriminator takes the z-vector as input and tries to classify it into the correct category. As in GANs, this corresponds to a two-player game in which the discriminator aims to maximize its ability to identify attributes, while the encoder aims to prevent the discriminator from succeeding. With this discriminator, the encoder can learn an invariant latent representation through an adversarial formulation of the learning objective.

The loss function of the discriminator is:

\[
L_{dis}(\theta_{dis} \mid \theta_{enc}) = -\frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \log p_{\theta_{dis}}\left(y \mid E_{\theta_{enc}}(x)\right)
\]

Together with the adversarial term (i.e., the discriminator), the loss function of the auto-encoder is:

\[
L(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \left\| D_{\theta_{dec}}\left(E_{\theta_{enc}}(x), y\right) - x \right\|_2^2 - \lambda_E \log p_{\theta_{dis}}\left(1 - y \mid E_{\theta_{enc}}(x)\right)
\]

Footnote 2. When training the auto-encoder, the label fed to the decoder is the real label; during inference, it is the target label.
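To make the two objectives above concrete, the following is a minimal, dependency-free sketch that evaluates both losses on a toy batch. It assumes a single binary attribute, for which the flipped target 1 - y is well defined; how that term extends to the ten-class labels used in this work is not specified in the text, so it is left out. The function names and all numbers are hypothetical, and the discriminator probabilities are passed in directly instead of being produced by a network.

```python
import math


def discriminator_loss(probs_true_label):
    """L_dis: mean negative log-likelihood of the true label given E(x).

    probs_true_label[i] stands for p_dis(y_i | E(x_i)).
    """
    m = len(probs_true_label)
    return -sum(math.log(p) for p in probs_true_label) / m


def autoencoder_loss(reconstructions, images, probs_flipped_label, lambda_e):
    """Mean of squared reconstruction error minus lambda_E * log p_dis(1 - y | E(x))."""
    m = len(images)
    total = 0.0
    for rec, img, p_flip in zip(reconstructions, images, probs_flipped_label):
        squared_error = sum((r - t) ** 2 for r, t in zip(rec, img))
        total += squared_error - lambda_e * math.log(p_flip)
    return total / m


# Toy batch: two "images" of three pixels each (hypothetical numbers).
images = [[0.0, 0.5, 1.0], [1.0, 1.0, 0.0]]
recons = [[0.1, 0.5, 0.9], [1.0, 0.8, 0.0]]
p_true = [0.9, 0.6]                 # discriminator confidence in the true label
p_flip = [1.0 - p for p in p_true]  # binary case: p(1 - y) = 1 - p(y)

print(discriminator_loss(p_true))
print(autoencoder_loss(recons, images, p_flip, lambda_e=0.1))
```

The two losses pull in opposite directions: the discriminator loss falls as the discriminator becomes more confident in the true label, while the adversarial term of the auto-encoder loss falls as the discriminator is pushed towards the wrong label.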

Figure 4: The final model we used (Fader Network plus a simple but fundamental regressor)

Fader Network uses the discriminator for adversarial training to force the z-vector to be independent of the label. During inference, we feed the auto-encoder an image along with a target label, and its output is an image conditioned on that label that preserves the identity of the original image.

Calibration

To further improve the results, we add a calibration module, a regressor (footnote 3), and modify the loss function of the auto-encoder accordingly. As shown in Figure 4, the regressor R takes an image (real or synthesized) as input and predicts a value indicating its label for some attribute (e.g., the SSFR or BTDR value of that image). We first train R on real images x and the corresponding labels y, and then use the trained regressor to predict labels for synthesized images, R_{\theta_{reg}}(D_{\theta_{dec}}(E_{\theta_{enc}}(x), y)). After that, we compute the MSE loss (footnote 4) between the target labels y and the predicted labels R_{\theta_{reg}}(D_{\theta_{dec}}(E_{\theta_{enc}}(x), y)). We minimize this loss so that, during back-propagation, the auto-encoder is optimized towards a smaller MSE loss (i.e., so that the synthesized images are, hopefully, more realistic).

The new loss function for the auto-encoder is:

\[
L(\theta_{enc}, \theta_{dec} \mid \theta_{dis}) = \frac{1}{m} \sum_{(x,y) \in \mathcal{D}} \left\| D_{\theta_{dec}}\left(E_{\theta_{enc}}(x), y\right) - x \right\|_2^2 - \lambda_E \log p_{\theta_{dis}}\left(1 - y \mid E_{\theta_{enc}}(x)\right) + \lambda_R \sum_{y' \in \{0,\dots,9\}} \left( R_{\theta_{reg}}\left(D_{\theta_{dec}}\left(E_{\theta_{enc}}(x), y'\right)\right) - y' \right)^2
\]

Implementation

We first re-implement Fader Network [2] and then experiment with minor modifications to the architecture, which we summarize here. We experiment with several auto-encoder models.
We find that the auto-encoder from BEGAN [1] (called the discriminator in the BEGAN paper) works well for our galaxy dataset, and we use it in our final model. For the discriminator, we make slight changes to the original Fader Network architecture by experimenting with different dimensions of the last hidden layer. We use a regression version of ResNet-50 as the regressor.

Footnote 3. Although predicting the label of an image looks like a classification problem, regression makes sense here because, for example, the error between predictions 0 and 1 is smaller than that between 0 and 2 in a regression problem, which is reasonable, whereas in classification the errors are the same.
Footnote 4. In our experiments, the choice of loss function here does not affect the results much.

4. Experiments

4.1. Set up and training

Data. We randomly subsample 5000 images from 10 classes to form the training set and 1000 other images as the test set.

Training procedure. To make the whole system update smoothly, we use the following training procedure:

Step 1: train only the auto-encoder, to obtain a network that recovers the original images very well.
Step 2: train only the regressor, to an MSE loss of around 1.7.
Step 3: train only the discriminator, to a classification accuracy of around 92%.
Step 4: train the auto-encoder and the discriminator together while slowly increasing \lambda_E and \lambda_R.

4.2. Results

We present our visual results in Figures 5 and 6. Each row is a separate, independent galaxy and shows its evolution process; each column corresponds to one class (from 0 to 9). The galaxies in the rectangles are reconstructions of the original images (i.e., the target label fed into the decoder is the same as the real label).

Quantitative evaluation

We also conduct quantitative evaluations of our results; see Figures 7 and 8. Here, we only consider the case where galaxies evolve along changes in the BTDR value. For both scenarios, before and after calibration, we train a regressor on real images and apply it to the synthesized images. In each figure, the x-axis shows the regressor's predictions and the y-axis shows the target labels. If the network were trained perfectly, the (x, y) points would form a diagonal line. Figures 7b, 7c, 8b and 8c show regression results on real images in the training and test sets of the regressor, before and after calibration, respectively.
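The diagonal criterion described above can also be summarized numerically. The sketch below computes the mean squared error and the Pearson correlation between regressor predictions and target labels; both summary statistics are our assumption for illustration (the evaluation in this work is presented as scatter plots), and the function name `diagonal_fit` and the sample predictions are hypothetical.

```python
import math


def diagonal_fit(predictions, targets):
    """Return (MSE, Pearson correlation) between predictions and targets.

    Points lying exactly on the diagonal give MSE 0 and correlation 1.
    """
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    n = len(targets)
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(predictions, targets))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    if var_p == 0 or var_t == 0:
        # Correlation is undefined when either series is constant.
        return mse, float("nan")
    return mse, cov / math.sqrt(var_p * var_t)


# Hypothetical regressor outputs for one galaxy decoded at target labels 0..9.
targets = list(range(10))
predictions = [0.4, 1.1, 1.8, 3.3, 3.9, 5.2, 5.8, 7.1, 8.3, 8.6]
mse, corr = diagonal_fit(predictions, targets)
print(f"MSE = {mse:.3f}, correlation = {corr:.3f}")
```

The two numbers capture different failures: a constant offset from the diagonal raises the MSE while leaving the correlation at 1, whereas predictions that ignore the target label drive the correlation towards 0.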
The regression results on real images (Figures 7b, 7c, 8b and 8c) indicate that our regressor is accurate and fair. The distribution of predictions on generated images (Figures 7a and 8a) shows that our model captures the properties of galaxy quenching to a great extent. The improvement from Figure 7a to Figure 8a shows that the calibration module helps significantly.

5. Related Work

Object attribute manipulation. Several deep-learning-based methods address the attribute manipulation problem. For example, [3] models the manipulation operation as learning residual images, and [2] directly outputs a different image by feeding the attribute labels as input to the decoder network.

Figure 5: Visual results of galaxies evolving with the changing SSFR value
Figure 6: Visual results of galaxies evolving with the changing BTDR value

Figure 7: Before calibration. (a) Regression on generated images. (b) Regression on real images in the training set of the regressor. (c) Regression on real images in the test set of the regressor.
Figure 8: After calibration. (a) Regression on generated images. (b) Regression on real images in the training set of the regressor. (c) Regression on real images in the test set of the regressor.

Invariant representation of images. There is much work on learning invariant representations using adversarial training, including [4]. We follow [2] and obtain a label-independent z-vector by training the encoder to fool a discriminator so that it cannot classify the z-vector correctly; ideally, the z-vector then contains no information about the real attributes.

Human face aging. Recently, the face aging problem (i.e., predicting the future appearance of a face) has been studied intensively, with physical-model-based approaches [5] and deep-learning-based approaches [6]. Some work addresses both face aging and face regression (i.e., estimating earlier appearance), including [7]. In our work, we focus on the deep-learning-based approach, which is an interesting counterpart to the physical models that astrophysicists have built for decades.

GAN and VAE. We apply Fader Network [2], a variant of generative adversarial networks, to model the data distribution of galaxies with different physical properties. VAEs and GANs belong to two different families: a VAE models an explicit density function, while a GAN models the density implicitly. We believe that a VAE could achieve similar results, and we leave the comparison between GAN and VAE on this task as future work.

6. Conclusion and future work

Our work provides a novel data-driven approach to determining the dominant physical properties that affect the galaxy quenching (i.e., evolution) process. We train and evaluate our models carefully, both qualitatively and quantitatively. A comparison between our method and state-of-the-art physical-model-based methods on this task is left for future work. Manipulating galaxies in the 3-D domain is another interesting direction to explore. We hope that our work gives physicists new insight into how they conduct their research, given the wealth of knowledge hidden in the data itself.

References

[1] D. Berthelot, T. Schumm, L. Metz, BEGAN: Boundary equilibrium generative adversarial networks, arXiv preprint.
[2] G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, M. Ranzato, Fader networks: Manipulating images by sliding attributes, arXiv preprint.
[3] W. Shen, R. Liu, Learning residual images for face attribute manipulation, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2017.
[4] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, V. Lempitsky, Domain-adversarial training of neural networks, Journal of Machine Learning Research 17 (59) (2016).
[5] J. Suo, X. Chen, S. Shan, W. Gao, Q. Dai, A concatenational graph evolution aging model, IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (11) (2012).
[6] W. Wang, Z. Cui, Y. Yan, J. Feng, S. Yan, X. Shu, N. Sebe, Recurrent face aging, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[7] Z. Zhang, Y. Song, H. Qi, Age progression/regression by conditional adversarial autoencoder, arXiv preprint.

Appendix A. More results

Figure A.9 shows synthesized images generated by first changing the SSFR value and then changing the BTDR value.
The galaxies from top left to bottom right show a reasonable evolution process.

Figure A.9: Synthesized images generated by first changing the SSFR value, then changing the BTDR value
