Introduction to Python Practical 2


Daniel Carrera & Brian Thorsbro
November 2017

1 Searching through data

One of the most important skills in scientific computing is sorting through large datasets and extracting the information that is interesting.

On your PC, create a folder for the exercises. Download the Hipparcos catalogue as a text file from the web page for ASTM13 (below). Be careful not to save it as an HTML file. Check the file with Notepad to make sure that it contains only data.

http://www.astro.lu.se/education/utb/astm13/hipparcos.txt

In the following we assume that the data file is called hipparcos.txt. Start Python/Spyder. Change the Current Directory to your folder. Open the script editor (press "New File") and type in the following script:

    # Load the functions from the numpy and matplotlib libraries
    from numpy import *
    from matplotlib.pyplot import *

    # Read the file into the 2D array data
    data = loadtxt('hipparcos.txt')

    # Columns.
    HIP  = data[..., 0]  # (---)    Hipparcos number.
    l    = data[..., 1]  # (deg)    Star longitude.
    b    = data[..., 2]  # (deg)    Star latitude.
    p    = data[..., 3]  # (mas)    Parallax.
    ul   = data[..., 4]  # (mas/yr) Proper motion, l direction.
    ub   = data[..., 5]  # (mas/yr) Proper motion, b direction.
    ep   = data[..., 6]  # (mas)    Standard error in parallax.
    el   = data[..., 7]  # (mas/yr) Standard error in proper motion, l direction.
    eb   = data[..., 8]  # (mas/yr) Standard error in proper motion, b direction.
    V    = data[..., 9]  # (mag)    Visual magnitude.
    col  = data[...,10]  # (mag)    Colour index, B-V.
    mult = data[...,11]  # (---)    Stellar multiplicity.
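The parallax column is given in milliarcseconds, and a parallax of p mas corresponds to a distance of 1000/p parsecs. The following is a minimal sketch of that conversion, not part of the original practical. The array p_mas holds made-up values rather than catalogue data, and the names p_mas, positive and d_pc are chosen here for illustration only. Because a measured parallax can be zero or negative through measurement error, the sketch converts only the positive entries.

```python
import numpy as np

# Hypothetical parallaxes in milliarcseconds (illustrative values only,
# not taken from the Hipparcos catalogue).
p_mas = np.array([100.0, 10.0, 1.0, 0.5, -0.3])

# d [pc] = 1000 / p [mas]; this is only meaningful for positive parallaxes,
# so the other entries are left as NaN.
positive = p_mas > 0
d_pc = np.full(p_mas.shape, np.nan)
d_pc[positive] = 1000.0 / p_mas[positive]

print(d_pc)
```

With this relation, a cut at 1 mas is the same as a cut at 1000 pc for stars with a positive measured parallax.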

Extract all stars with a parallax less than 1 mas. This is a list of all stars farther than 1000 pc.

    # mask is a boolean array.
    # True == parallax less than 1 mas
    mask = (p < 1)
    subset = p[mask]
    print(size(subset))

Stars with parallax less than 1 mas are located farther than 1000 pc. Determine the fraction of the Hipparcos catalogue that is farther than 1000 pc.

    print(size(p[mask]) / size(p))

1.1 Map of Hipparcos stars

In this section we are going to make a map of the local region of the galaxy, in order to study the distribution of red and blue stars. To do this well, we need to think about the best way to project the celestial sphere onto a flat plane so as to minimize distortion. I recommend a little-known projection by Soviet cartographer Vladimir Kavrayskiy (1884-1954). It has a simple formula, and does a very good job at preserving area and shape:

\[ x = \frac{3l}{2} \sqrt{\frac{1}{3} - \left(\frac{b}{\pi}\right)^2}, \qquad y = b \]

where $b \in [-\pi/2, \pi/2]$ and $l \in [-\pi, \pi]$ are latitude and longitude (respectively). For illustration, here is a map of the Earth in this projection (source: wikipedia.org):

[Figure: map of the Earth in the Kavrayskiy VII projection]
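As a quick check on the formula above, the extent of the map can be worked out by evaluating x at the limits of l and b. This short derivation is an addition for illustration and uses only the projection formula as stated.

```latex
% Equator (b = 0), at the edge of the map (l = \pm\pi):
x = \pm\frac{3\pi}{2}\sqrt{\frac{1}{3}} = \pm\frac{\sqrt{3}}{2}\,\pi \approx \pm 2.72

% Poles (b = \pm\pi/2), at l = \pm\pi:
x = \pm\frac{3\pi}{2}\sqrt{\frac{1}{3} - \frac{1}{4}}
  = \pm\frac{3\pi}{2}\cdot\frac{1}{2\sqrt{3}}
  = \pm\frac{\sqrt{3}}{4}\,\pi \approx \pm 1.36

% Vertical extent:
y = b \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \approx [-1.57,\ 1.57]
```

The map is therefore twice as wide at the equator as at the poles, and the poles are drawn as lines rather than points.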

This projection is called Kavrayskiy VII. Enter the following lines to produce a Kavrayskiy VII projection of the red and blue stars in the Hipparcos catalogue:

    # Convert latitude and longitude to radians.
    b = b * pi/180
    l = l * pi/180

    # Wrap around after longitude > pi, so it goes from -pi to pi.
    l[l > pi] = l[l > pi] - 2*pi

    # Do the Kavrayskiy VII projection.
    y = b
    x = l*3/2 * sqrt( 1/3 - (b/pi)**2 )

    # Masks for blue and red stars.
    blue = (col < 0)  # Stars with colour index B-V < 0.0
    red  = (col > 1)  # Stars with colour index B-V > 1.0

    # Final plot.
    figure(1)
    plot( x[red], y[red], 'r.', x[blue], y[blue], 'b.' )
    legend( ['B-V > 1', 'B-V < 0'] )
    title( 'Hipparcos stars - Kavrayskiy VII projection' )
    ylabel( 'Galactic latitude' )
    xlabel( 'Projection of galactic longitude' )

In the end, you should have a plot similar to this:

[Figure: Hipparcos stars with B-V > 1 (red) and B-V < 0 (blue) in the Kavrayskiy VII projection]
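The projection steps above can also be wrapped in a function, which makes them easy to check at a few known points before plotting the whole catalogue. This is a sketch under stated assumptions rather than part of the original practical: the function name kavrayskiy7 and its arguments are hypothetical, it uses an explicit `import numpy as np` instead of the star imports, and it returns new arrays rather than modifying l and b in place.

```python
import numpy as np

def kavrayskiy7(l_deg, b_deg):
    """Project longitude/latitude in degrees to Kavrayskiy VII (x, y).

    Hypothetical helper written for illustration.
    """
    l = np.radians(np.asarray(l_deg, dtype=float))
    b = np.radians(np.asarray(b_deg, dtype=float))
    # Wrap longitudes above pi so that l runs from -pi to pi.
    l = np.where(l > np.pi, l - 2 * np.pi, l)
    x = 1.5 * l * np.sqrt(1.0 / 3.0 - (b / np.pi) ** 2)
    y = b
    return x, y

# Three test points: the origin, l = 90 deg and l = 270 deg on the equator.
x, y = kavrayskiy7([0.0, 90.0, 270.0], [0.0, 0.0, 0.0])
print(x, y)
```

The point at l = 270 degrees is wrapped to -90 degrees, so it should land at the mirror image of the point at l = 90 degrees.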

Discuss the distribution of red and blue stars with your classmates. Here are some interesting questions that you might want to think about:

- What are the main differences between the blue and red stars?
- Where in this picture can you find the galactic centre (latitude = 0, longitude = 0)?
- What could cause the two prominent over-densities of blue stars?
- What could cause the deep void of blue stars near the centre of the plot? Why does this void not affect red stars as much?
- Are there important biases in the sample?
- Why are there some blue stars at high galactic latitudes?

Try to have an interesting discussion with your colleagues before moving to the next section.

2 Generating random data

In science it is often necessary or useful to generate simulated data. For example, the first problem set for ASTM21 is to take the 2D galaxy distribution in the Hubble Ultra Deep Field (HUDF) and determine whether it is uniform. For this project it may be helpful to produce a few simulated galaxy distributions that are uniform and try to write a statistic that can distinguish the simulated data from the real Hubble data.

Enter the following code in Python. Here we use the random.uniform function to produce a star field with a uniform distribution.

    # Number of stars.
    nstars = 1000

    # X and Y position.
    u_x = random.uniform(0,1,nstars)
    u_y = random.uniform(0,1,nstars)

    # Plot the star field.
    figure(2)
    plot(u_x,u_y,'b.')

Enter the following code. This version uses normal instead of uniform. Called with arguments 0 and 1, the function normal produces random values following the standard normal distribution (zero mean, variance one). Thus, the following version produces a star field more akin to a star cluster.

    # Normal distribution
    c_x = random.normal(0,1,nstars)
    c_y = random.normal(0,1,nstars)

    # Plot the cluster
    figure(3)
    plot(c_x,c_y,'b.')

Lastly, we would like to combine these two datasets. That would produce a more realistic star field around an open cluster. The star field would have a combination of stars from the cluster and background stars. Start by plotting the two data sets. You will have to modify the data sets slightly to get reasonable results. Here is my solution, but I encourage you to experiment.

    figure(4)
    plot(u_x*10-5,u_y*10-5,'r.',c_x,c_y,'b.')

Once you are happy with your plot, join the data sets accordingly:

    x = concatenate((u_x*10-5, c_x), axis=0)
    y = concatenate((u_y*10-5, c_y), axis=0)

You have now produced a simulated (x,y) dataset that is similar to what you might observe on a CCD image of a star cluster.

3 Sample application

We want to pick a star at random. Because the data was randomly generated, we can pick star 1. Plot the star field in blue and put a red + on star 1:

    x1 = x[0]
    y1 = y[0]
    figure(5)
    plot(x,y,'b.',x1,y1,'r+')

First, find all the neighbours of star 1. The definition of neighbour is a bit arbitrary. In the following example I define it as the set of stars within distance 1 of star 1. But you should experiment with different distance values:

    # Distance that defines a neighbour
    d = 1

    # Find the distance to every other star.
    r = sqrt( (x1 - x)**2 + (y1 - y)**2 )

    # Select those that have r < d.
    nbhr_x = x[ r < d ]
    nbhr_y = y[ r < d ]

    # Plot the neighbours with a black circle.
    plot(x,y,'b.',x1,y1,'r+',nbhr_x,nbhr_y,'ko',x1,y1,'r+')

    # Count the number of neighbours of star 1.
    # (The count includes star 1 itself, since r = 0 there.)
    num_neighbours = sum( r < d )
    print(num_neighbours)

We can write a Python function to help us find the star cluster in our artificial star field. First, we can define the local density at the point (xp,yp) as the number of stars within distance d of (xp,yp). Write a function to compute the local density of a star field. The indentation is important, as it tells Python which lines are part of your function. Put the function in your script file so that it appears before the first place you use it:

    # function: find density around xp,yp given stars in starsx,starsy
    # returns the density
    def density( starsx, starsy, xp, yp ):
        d = 1
        r = sqrt( (xp - starsx)**2 + (yp - starsy)**2 )
        return sum( r < d )

Confirm that this function is correct by checking that it gives the same number of neighbours that you obtained earlier for star 1:

    print(density(x,y,x1,y1))

Now plot the local density along the X axis.

    rho = zeros(21)  # allocate memory
    yp = 0
    for i in range(0,21):  # element 0 is included, but 21 is not included!
        xp = i - 10
        rho[i] = density(x,y,xp,yp)

    figure(6)
    plot(arange(-10,11,1),rho)

The plot is probably not very smooth. Can you improve the for loop to produce a better plot? Based on this plot, how would you define the edge of the cluster?

Alternatively, we could use the density function to determine which stars belong to the star cluster. A simple implementation would look like this:

    cluster_x = []
    cluster_y = []
    rho_min = 50
    for i in range(0,size(x)):
        if density(x,y,x[i],y[i]) > rho_min:
            cluster_x = insert(cluster_x,size(cluster_x),x[i],axis=0)
            cluster_y = insert(cluster_y,size(cluster_y),y[i],axis=0)

    figure(7)
    plot(x,y,'b.',cluster_x,cluster_y,'r+')

Experiment with different values of rho_min and d. Choose a good set of parameters to find the star cluster. Compare answers with your classmates. Ideally you would like to find a routine that reliably finds the cluster for all the generated data sets. Could the routine be improved if it was based on the mean density of the star field? Try to implement a cluster-finding routine that uses the mean star density rather than a hard-coded value.

4 Code profiling

Solving linear systems (Ax = b) is one of the most common and most expensive operations in scientific computing. For example, this operation is used for linear least squares optimization. In the following code example, we use the Python library timeit to profile the cost of this operation:

    import timeit

    t = zeros(500)
    for n in range(1,501):
        A = random.uniform(0,1,(n,n))
        b = random.uniform(0,1,n)
        tic = timeit.default_timer()
        for i in range(0,5):
            linalg.lstsq(A,b)
        toc = timeit.default_timer()
        t[n-1] = (toc-tic) / 5  # n starts at 1 but the first index is 0

    figure(8)
    plot(t)

There is a risk that random.uniform() will sometimes produce a matrix that just happens to be easy to invert. Can you suggest a way to improve the above for-loop to minimize this risk?
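One possible approach, offered only as a sketch and not as the intended answer, is to draw a fresh random system for every repetition and average the timings, so that no single lucky matrix dominates the result. The function name mean_solve_time and its parameters repeats and seed are hypothetical. The sketch uses numpy's default_rng generator and passes rcond=None to lstsq, neither of which appears in the original script, and it times a smaller range of sizes to keep the run short.

```python
import timeit
import numpy as np

def mean_solve_time(n, repeats=5, seed=0):
    """Average lstsq time over `repeats` different random n-by-n systems.

    Hypothetical helper written for illustration.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(repeats):
        # Draw a new system each time, outside the timed region.
        A = rng.uniform(0, 1, (n, n))
        b = rng.uniform(0, 1, n)
        tic = timeit.default_timer()
        np.linalg.lstsq(A, b, rcond=None)
        toc = timeit.default_timer()
        total += toc - tic
    return total / repeats

sizes = range(10, 101, 10)
t = np.array([mean_solve_time(n) for n in sizes])
print(t)
```

Generating the matrices outside the timed region keeps the cost of random.uniform itself out of the measurement.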