Docsity
Docsity

Prepare-se para as provas
Prepare-se para as provas

Estude fácil! Tem muito documento disponível na Docsity


Ganhe pontos para baixar
Ganhe pontos para baixar

Ganhe pontos ajudando outros esrudantes ou compre um plano Premium


Guias e Dicas
Guias e Dicas


Guia computacional de Astronomia, Manuais, Projetos, Pesquisas de Astronomia

Guia computacional para dados em astronomia

Tipologia: Manuais, Projetos, Pesquisas

2020

Compartilhado em 03/05/2020

gustavo-silva-2w0
gustavo-silva-2w0 🇧🇷

1 documento

1 / 71

Toggle sidebar

Esta página não é visível na pré-visualização

Não perca as partes importantes!

bg1
Version January 8, 2020
Preprint typeset using L
A
T
E
X style openjournal v. 09/06/15
A BEGINNER’S GUIDE TO WORKING WITH ASTRONOMICAL DATA
Markus P¨
ossel
Haus der Astronomie and Max Planck Institute for Astronomy
Contents
1. Introduction 1
1.1. Types of data 2
1.2. Types of tools 3
1.3. Concepts and operations 3
1.4. Software/language choices 4
2. Data basics: images, spectra, tables 5
2.1. Images: Colour, brightness, pixels 5
2.2. Images: PSF and noise 7
2.3. Images: Noise and flatfielding 8
2.4. Images: astronomical information 10
2.5. Spectra 11
2.6. Data cubes 14
2.7. High-level data: catalogues and tables 15
3. SAOImage DS9 and astronomical images 18
3.1. Loading a Hubble image 18
3.2. A first look at the Eagle Nebula M16 19
3.3. Coordinates: Navigating the image 20
3.4. Meta-Information: the FITS header 21
3.5. Making a colour image 22
3.6. Catalogs 23
3.7. Photometry with regions and statistics 24
3.8. Profiles 25
4. TOPCAT and table data 26
4.1. Opening a table file 27
4.2. Making a sky plot 28
4.3. Virtual Observatory (VO) services 28
4.4. Basic ADQL queries 30
4.5. Selections and subsets 32
4.6. More on plotting 34
4.7. Histograms 37
4.8. A quick look at a spectrum 37
5. Getting started with Python 38
5.1. Installing Python 39
5.2. Using Python in Spyder 39
5.3. Modules 41
6. Basic operations with Python 42
6.1. Meet your new versatile calculator 42
6.2. Units and constants 42
6.3. Random numbers 43
6.4. Strings 43
6.5. Conditions 44
6.6. User-defined functions 45
6.7. Timing your code 46
7. Taming long data sets: Lists in Python 46
7.1. A list of galaxies 46
7.2. Doing something element by element 47
7.3. Operations involving more than one list 48
7.4. Creating lists simultaneously 49
7.5. Numpy arrays 50
7.6. Variable typ es, lists, arrays and speed 50
7.7. Strings and base n numbers as lists 51
8. Basic plotting with Python and Matplotlib 52
8.1. Plotting a function 52
8.2. Making a plot look better 52
8.3. Annotating plots 53
8.4. Figure size 54
8.5. Scatter plots 54
8.6. Fitting data 55
8.7. Histograms 56
8.8. Saving figures 57
8.9. Glueing data sets 57
9. Importing table data into Python 57
9.1. Opening a FITS table in python 57
9.2. Opening an ASCII table in python 58
9.3. Accessing astronomical data bases 59
10. Astronomical image manipulation with Python 60
10.1. FITS files and python 60
10.2. Displaying (showing) an image 60
10.3. Pixelwise operations 61
11. A simple simulation 63
11.1. Step-by-step numerical integration: Euler
method 63
11.2. Numerical errors 65
11.3. Velocity Verlet algorithm 65
11.4. A simple two-dimensional simulation 66
12. Conclusion 70
1. INTRODUCTION
Compare professional astronomy today with how it was
50 years ago, and you will recognise some continuity
but also a number of fundamental changes. Perhaps the
key change is that astronomy has come to rely almost
completely on digital data. Modern telescopes with their
CCD cameras produce digital images and, with the help
of suitable dispersive elements, digital astronomical spec-
tra. An in-depth analysis of a particularly well-studied
object will be able to make use of digital images and
spectra taken at different wavelengths some taken by
ground-based telescopes and some, like extreme ultra-
violet or X rays, which can only be provided by space
telescopes.
Furthermore, in-depth studies of selected objects are
only part of what modern astronomy has to offer. We
also live in an era of extensive surveys: large-scale under-
takings to photograph, or take spectra of, wider regions
of the sky in a systematic way. These surveys not only
produce many images and spectra, but also extensive cat-
alogues of the objects observed, listing various of their
properties. With such catalogues comes the ability to
make statistical deductions about astronomical objects:
If you want to know whether, say, elliptical galaxies are,
in general, brighter than spiral galaxies (to pick an ar-
tificially simple example), you, consult a suitable galaxy
arXiv:1905.13189v2 [astro-ph.IM] 7 Jan 2020
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47

Pré-visualização parcial do texto

Baixe Guia computacional de Astronomia e outras Manuais, Projetos, Pesquisas em PDF para Astronomia, somente na Docsity!

Version January 8, 2020 Preprint typeset using LATEX style openjournal v. 09/06/

A BEGINNER’S GUIDE TO WORKING WITH ASTRONOMICAL DATA

Markus P¨ossel Haus der Astronomie and Max Planck Institute for Astronomy

Contents

  1. Introduction 1 1.1. Types of data 2 1.2. Types of tools 3 1.3. Concepts and operations 3 1.4. Software/language choices 4
  2. Data basics: images, spectra, tables 5 2.1. Images: Colour, brightness, pixels 5 2.2. Images: PSF and noise 7 2.3. Images: Noise and flatfielding 8 2.4. Images: astronomical information 10 2.5. Spectra 11 2.6. Data cubes 14 2.7. High-level data: catalogues and tables 15
  3. SAOImage DS9 and astronomical images 18 3.1. Loading a Hubble image 18 3.2. A first look at the Eagle Nebula M16 19 3.3. Coordinates: Navigating the image 20 3.4. Meta-Information: the FITS header 21 3.5. Making a colour image 22 3.6. Catalogs 23 3.7. Photometry with regions and statistics 24 3.8. Profiles 25
  4. TOPCAT and table data 26 4.1. Opening a table file 27 4.2. Making a sky plot 28 4.3. Virtual Observatory (VO) services 28 4.4. Basic ADQL queries 30 4.5. Selections and subsets 32 4.6. More on plotting 34 4.7. Histograms 37 4.8. A quick look at a spectrum 37
  5. Getting started with Python 38 5.1. Installing Python 39 5.2. Using Python in Spyder 39 5.3. Modules 41
  6. Basic operations with Python 42 6.1. Meet your new versatile calculator 42 6.2. Units and constants 42 6.3. Random numbers 43 6.4. Strings 43 6.5. Conditions 44 6.6. User-defined functions 45 6.7. Timing your code 46
  7. Taming long data sets: Lists in Python 46 7.1. A list of galaxies 46 7.2. Doing something element by element 47 7.3. Operations involving more than one list 48 7.4. Creating lists simultaneously 49 7.5. Numpy arrays 50 7.6. Variable types, lists, arrays and speed 50 7.7. Strings and base n numbers as lists 51 8. Basic plotting with Python and Matplotlib 52 8.1. Plotting a function 52 8.2. Making a plot look better 52 8.3. Annotating plots 53 8.4. Figure size 54 8.5. Scatter plots 54 8.6. Fitting data 55 8.7. Histograms 56 8.8. Saving figures 57 8.9. Glueing data sets 57 9. Importing table data into Python 57 9.1. Opening a FITS table in python 57 9.2. Opening an ASCII table in python 58 9.3. Accessing astronomical data bases 59
  8. Astronomical image manipulation with Python 60 10.1. FITS files and python 60 10.2. Displaying (showing) an image 60 10.3. Pixelwise operations 61
  9. A simple simulation 63 11.1. Step-by-step numerical integration: Euler method 63 11.2. Numerical errors 65 11.3. Velocity Verlet algorithm 65 11.4. A simple two-dimensional simulation 66
  10. Conclusion 70
  11. INTRODUCTION Compare professional astronomy today with how it was 50 years ago, and you will recognise some continuity — but also a number of fundamental changes. Perhaps the key change is that astronomy has come to rely almost completely on digital data. Modern telescopes with their CCD cameras produce digital images and, with the help of suitable dispersive elements, digital astronomical spec- tra. An in-depth analysis of a particularly well-studied object will be able to make use of digital images and spectra taken at different wavelengths — some taken by ground-based telescopes and some, like extreme ultra- violet or X rays, which can only be provided by space telescopes. Furthermore, in-depth studies of selected objects are only part of what modern astronomy has to offer. We also live in an era of extensive surveys: large-scale under- takings to photograph, or take spectra of, wider regions of the sky in a systematic way. These surveys not only produce many images and spectra, but also extensive cat- alogues of the objects observed, listing various of their properties. With such catalogues comes the ability to make statistical deductions about astronomical objects: If you want to know whether, say, elliptical galaxies are, in general, brighter than spiral galaxies (to pick an ar- tificially simple example), you, consult a suitable galaxy

arXiv:1905.13189v2 [astro-ph.IM] 7 Jan 2020

catalogue and look up the brightness values for a large number of elliptical and a large number of spiral galaxies. Modern surveys produce considerable amounts of data. For home use, we have become used to Gigabytes (1 Gi- gabyte =1000 Megabyte) and Terabytes (TB; 1 Terabyte = 1000 Gigabyte): a DVD holds 4.7 Gigabytes, and hard drives now routinely hold a Terabyte or more. The ESO Survey Telescope VISTA produces about 150 TB worth of data per year, and the Large Synoptic Survey Tele- scope (LSST) currently under construction is predicted to produce 500 TB worth of image data per month. Then, there a increasingly large and detailed simula- tions. Take the IllustrisTNG simulation, which follows the evolution of a large portion of the universe from shortly after the Big Bang to the present. The small- est but most detailed of the TNG runs, TNG50, follows the fate of a cube that, in the present universe, has a side-length of 50 Mpc. Within this volume, matter is rep- resented by 10 billion point particles representing Dark Matter and 10 billion point particles representing gas. The two simulation runs TNG300 and TNG100 which were made available to the public in December 2018 sum up to more than one Petabyte of data (PB; 1 Petabyte = 1000 Terabytes). Increase the amount of data, and at some point it will become impractical to download a complete data set onto your own computer for analysis. This is where data base operations become important: the data is stored in ded- icated data centres, and is accessible online; in order to work with the data, you use the Internet to send specific queries (”Give me the list of all galaxies on the Southern hemisphere which are brighter than X”). In this way, the only data you download is the data you specifically need for your research. The next step is not far off: when even those pre-selected data sets become too cumbersome to handle, researchers can run their analysis programs re- motely on the dedicated servers where the data is stored. Infrastructure to allow just this, notably JuPyter note- books, are becoming increasingly common. All this implies that digital data analysis skills are part of the essential skill sets of modern astronomers. Some of the skills needed for a given research project will be very specific, involving custom software to be used for a very particular kind of analysis, or custom software to be written by the researcher herself. These special skills must be learned and honed on the job. But there are other skills which are more elementary and more gen- eral. Teaching a selection of those skills is the purpose of this text. It was originally written for interns at Haus der Astronomie in Heidelberg, in particular for partici- pants of our International Summer Internship Program^1 aimed at students in the final years of high school, or for students who have just finished high school and are about to start college. The text is meant to give a first introduction to work- ing with astronomical data. It does not cover the more detailed astronomical use cases, but instead is meant to help students familiarise themselves with the basic tools needed for such work, and learn to apply basic techniques and tools that are fairly universal.

(^1) http://www.haus-der-astronomie.de/en/what-we-do/ internships/summer-internship

1.1. Types of data When it comes to data from observational astronomy, most data sets we will be dealing with fall into one of the following categories:

  • Image data — in its simplest form, an image is a two-dimensional array of pixels, where each pixel value denotes a brightness value. In an ordinary color image, each pixel will have three brightness values, denoting the contribution from red, green, and blue (RGB). Since astronomers use many spe- cialist filters beyond these three colors, astronomi- cal ”color” images can have even more color values per pixel. Astronomical images usually show a re- gion of the night sky.
  • Spectra — simple spectra show us how the en- ergy of the light emitted by an object is distributed among the different possible wavelengths. Such simple spectra are one-dimensional: for each wave- length value, we know the contribution of light from that particular wavelength region.
  • Data cubes — think of these as an enhanced ver- sion of astronomical images. An example is a data cube from what is known as integral field spec- troscopy (IFS); such a data cube is like a two- dimensional image, but now each pixel contains not a brightness value, but a whole spectrum received from the region of the sky within that pixel. Since we have a one-dimensional spectrum for each pixel of a two-dimensional image, that gives us in effect a three-dimensional object: a data cube
  • Catalog data — on a higher level of analysis, as- tronomers make catalogues of the properties of dif- ferent types of astronomical objects. A star cata- log, for instance, could list position, proper motion, parallax, brightness (in various wavelength bands), and effective temperature for each of a specific se- lection of stars. This list is not complete — for instance, in interferomet- ric imaging, when you are trying to reconstruct an image by combining coherently the measurements of different telescopes (“aperture synthesis”), your raw data will be time-stamped data from the single telescopes, and the initial processing will involve cross-correlating the data between those telescopes. But while the list is not com- plete, it should cover the great majority of current astro- nomical use cases. We will take a first look at examples for each data type in section 2. These different data usually come with meta-data, that is, descriptive information about the data. Astronomical images typically include information about the circumstances of when and how the image was taken (what telescope, what time, what exposure time, what pointing?), and about where in the sky the object in question is located (in the shape of data allowing the user to relate image pixels to an astronomical coordinate system). For simulations, the situation is more diverse, but there are two fundamental schemes:
  • N-body simulations — here, matter is repre- sented by point particles. A point particle can rep-

This starts with basic mathematical operations. When you go from the magnitude to the flux emitted by an astronomical object, you will need the “x to the power of n” operation; on the way back, the logarithm. Whenever you perform calculations with your data, you will need the appropriate operations. Data points come in sets: the pixel data for an image corresponds to a two-dimensional arrangement, while a list of properties for astronomical objects will correspond to a one-dimensional chain of values. Programming lan- guages feature suitable data structures for this kind of connected data, such as lists, arrays, tuples, or differ- ent kinds of table (the meaning of those words can differ somewhat between one programming language and the next). Knowledge of these data types and the various ways of manipulating them is a must, along with knowledge of more basic types such as strings, integers or floating point numbers — and of course the basic concept of storing values in a variable in the first place! For operations on our data, we need control struc- tures. If we want to perform a certain operation on every element of a list, for instance, we will need some- thing like a for loop. In order to distinguish between different cases — a structure that allows us to apply a certain combination of operations to every element of a list. There will also be situations where we might want to perform a certain operation on some elements of the list, but not on specific other elements — to accomplish this, we need if clauses, that is, tools that tell our program to apply certain operations only if specific conditions are met, but not otherwise. In addition to this kind of general knowledge, which is required when learning pretty much any general pro- gramming language, astronomers should have at their disposal a set of programming tools for more specific tasks — which often equates with familiarity with par- ticular libraries or modules. Often, we want to visualize our data, so knowledge of how to create various kinds of plots, diagrams or histograms (both one-dimensional histograms and two-dimensional density plots) is essen- tial. Last but not least, how do we get data into our pro- gram, and our results out again? If we have obtained the data by downloading a file, we will need to know about proper input/output operations (in short, i/o). For certain data formats, such as the ubiquitous FITS image files that are the usual format for astronomical images, or for astronomical tables in FITS or VOTable format, there are special functions that read the data in a way that makes it particularly easy to start working with them. When we do not download the data in the form of files, but instead access astronomical data bases, there is an additional issue. We need to tell the data base which specific set of data we would like to access. In order to do so, we must submit a data base query, or query for short, to the data base: a formalized request for data, written in a specific query language. A number of astro- nomical data bases are organised in the shape of a Vir- tual Observatory (VO) — data bases that conform to certain common standards to enable easy access for all astronomers. The query language for the VO is the As- tronomical Data Query Language (ADQL), which

is similar to the more general Structured Query Lan- guage (SQL, pronounced either ”S–Q–L” or ”sequel”). Queries in this language are useful both in the context of an application software like TOPCAT, where they can used in the framework of the Table Access Protocol to download a specific subset of data from an online data base via the Internet, or as part of a Python program. There is another aspect of all this, which would require a tutorial of its own for proper treatment: data can be generated by software, too. Astronomy isn’t only about observing. In the end, we want to understand the objects we observe. That involves creating simplified models for these objects. If a star is (put simply) a gigantic ball of plasma, held together by its own gravity and heated up by nuclear fusion reactions in its core, then if we create a simulation of such an object, using our knowledge of the laws of physics, the resulting model should have similar properties to a real star (as we can check using observations). Simulations, too, require coding. In physics, only the simplest situations can be described “analytically”, that is, writing down what happens in terms of simple func- tions such as sin(x), cos(x), polynomials and the like. For more complicated situations, you will need to simulate what happens numerically: starting with the initial sit- uation, and then letting the computer reconstruct, time step by time step, what happens. We will encounter a very simple simulation in section 11.

1.4. Software/language choices Every text on data processing has the same problem: For specific applications, there is usually more than one application software, and of course there are numerous programming languages. In teaching about data process- ing, one should include specific examples, and students should work through such examples themselves, gain- ing hands-on experience with all that data processing involves. If the author chooses to present these exam- ples in one specific programming language, at least some students will later, when they are working on a specific project, need to re-learn a different programming lan- guage. This is not as bad as it sounds, though. Most pro- gramming languages, and most applications, share simi- lar concepts and functions. Once you have learned about those in the framework of one specific programming lan- guage, or application software, switching to another lan- guage or software will be much easier than starting from scratch. Thus, learning what this tutorial has to offer is definitely not a waste of time, even if it should turn out that later on you will need to adapt to other software. In this tutorial, I have chosen some common applica- tion software for simple operations: SAOImage DS (DS9 for short) is a comparatively simple image viewer that also allows some basic manipulation of astronom- ical images. TOPCAT is a standard tool for access- ing data from the Virtual Observatory. The program- ming language used for more complex tasks is Python, which is widely used in astronomy. This wide use has a great advantage: astronomers have been writing help- ful astronomy-specific libraries and modules for Python, and are actively maintaining them. If you’re starting a career in astronomy, chances are that you will do your basic programming in Python.

All that said, let’s get started. To get our bearings, we start with something simple: before we delve into astronomical Python and start coding ourselves, let us begin with two simple use cases for application software: In section 3, we will look at astronomical images and combine red, green and blue filter images into a color image. In section 4, we will look at some basic table operations with TOPCAT.

  1. DATA BASICS: IMAGES, SPECTRA, TABLES In astronomy, just as in other sciences, we are not inter- ested in data for data’s sake. We want to do astrophysics: we want to use data to further our understanding of the universe. Before we look at specific tools, and learn how to use them, let us consider some of the properties of astronomical data, as well as some of the specific ways of extracting information from them.

2.1. Images: Colour, brightness, pixels Astronomers take images of astronomical objects, us- ing telescopes and suitable instruments attached to those telescopes. Public versions of such images can be stun- ningly beautiful, and contribute significantly to the fas- cination of the general public with astronomy. The un- derlying science images are commonly stored in a for- mat known as FITS, which stands for the “Flexible Im- age Transport System” — a flexible file format that as- tronomers have been using for images, spectra, data ta- bles and more since the 1980s. When you encounter a professional astronomical image, it is likely to be in that particular format, with file extensions “.fits” or “.fit” on an older Windows machine. Figure 1 shows one of the iconic images from the Hubble Space Telescope, namely the open cluster Westerlund 2.

Fig. 1.— Image of the open star cluster Westerlund 2, taken by the Hubble Space Telescope. The image data was downloaded from spacetelescope.org. Image credit: NASA, ESA, the Hubble Heritage Team (STScI/AURA), A. Nota (ESA/STScI), and the Westerlund 2 Science Team

Using this image as an example, we can demonstrate a number of properties of astronomical image data. Phe- nomenologically, the image contains two types of infor- mation: the stars we see in the image are much too small

for even Hubble to see any of their structure. They ap- pear as point sources. In addition, we have extended sources, in this case a region of ionized hydrogen (HII, in astronomical parlance), which are, as the name says, extended areas with varying brightness and colour. While we tend to think of astronomical images as a rendition of “what’s up there in the sky,” there are sev- eral aspects in which such images are not faithful rendi- tions — and those aspects are crucial for understand- ing astronomical image data. Let us start with the colours. Professional astronomical images are black- and-white images. One reason for this is that digital cameras are, at their most basic, black-and-white. For each pixel, they can only record how much light has fallen onto the collecting area for that pixel (more specifically: how many photons have fallen). Consumer cameras as in your smartphone or your digital camera are only able to produce colour images because they have an array of filters installed in front of the array of light-detecting sen- sor pixels. A common pattern is the Bayer mask, part of which is shown in figure 2.

1

Fig. 2.— Part of a Bayer mask pattern: an array of filters in- stalled in front of detector pixels to enable the quick creation of a colour image

When such a consumer camera has taken an image, the colour information is interpolated, and a colour image is displayed and saved. (If you have a camera that can save images in some kind of “raw” mode, you can see the not-yet-interpolated pattern.) For astronomical images, a fixed filter mask is imprac- tical for several reasons. Astronomers would like to get the full resolution for their images, so reducing resolution in each color band by having information about the green brightness in every second pixel only, and about red and blue in every fourth pixel, is a drawback. Colour inter- polation means that some of the colour information gets lost. Also, astronomers use a bewildering array of possi- ble filters, not just those corresponding to red, green, and blue (R, G, B) — some of those filters capture a wider wavelength range, while narrow-band filters might cap- ture just a particular spectral line. Astronomers need the flexibility of putting these different filters in front of their camera. On the plus side, most astronomical objects change only very slowly. Using different filters in succession, taking an image first through one filter, then through the next, is perfectly feasible; those im- ages will all show the target object in effectively the same state. (On the rare occasions where speed is of the essence, astronomers will use something like paral- lel cameras observing through different filters. A case in point are gamma ray burst afterglows, which fade on a

Fig. 4.— Part of the Westlund 2 image taken through an R (red) filter in April 2017 with the 2 m Faulkes Telescope operated by Las Cumbres Observatory at Siding Spring in Australia with different scaling. Left: Linear scaling from 0 to 65536. Center: Linear scaling from 4572 to 6002. Right: Square scaling from 4572 to 6002

eral images have been stitched together to form a larger picture. The beautiful Hubble version of Westerlund 2 in figure 1 is a case in point, as it is a composite image using observations with Hubble’s Advanced Camera for Surveys (ACS) and its Wide Field and Planetary Cam- era 3 (WFPC3). Figure 5 shows a sample WFPC3 image (although probably not one used in the final composite^4 ) pasted into the final colour image to give you an idea of

Fig. 5.— Image credit: NASA, ESA, the Hubble Heritage Team (STScI/AURA), A. Nota (ESA/STScI), and the Westerlund 2 Sci- ence Team

the footprint of the WFPC3. In fact, the WFPC3 in- set is already a blend of four images, created from the three image chips (CCDs) of the Wide Field Camera (the three larger squares) and the image chip of the Planetary Camera (smaller square nestled into the corner formed by the other three). As a next step, let’s zoom in into the WFPC3 im- ages, more concretely: into one of the Wide Field Cam- era squares. The result can be seen in figure 6. There are several points of note. The first is that the image is made up of discrete square fields: pixels. That is no surprise if you have ever looked very, very closely at dig- ital photographs. It also means that, at the lowest level,

(^4) I didn’t find any of the original WFPC3 images in the MAST archive; all I could find were already (smaller) composites.

Fig. 6.— Zoom in on the WFPC2 image of Westerlund 2. Image credit: NASA, ESA

working with digital images means working with pixels, and with the brightness value associated with each pixel. Pixel positions are described by a pair of (integer) co- ordinates for each pixel. A schematic example is shown in figure 7.

Figure 5: Image credit: NASA, ESA, the Hubble Heritage Team (STScI/AURA), A. Nota (ESA/STScI), and the Westerlund 2 Science Team

Figure 6: Zoom in on the WFPC2 image of Westerlund 2. Image credit: NASA, ESA

1 2 3 4 5

1

2

3

4

5

Figure 7: Pixels and pixel coordinates

14

Fig. 7.— Pixels and pixel coordinates

The brightest pixel in the 5 × 5 array would have the coordinates (4, 3), since it is in the fourth column from the left, and in the third row from the bottom. (Beware, other conventions exist! Some count rows from the top. Some start the count at number 0, not number 1.)

2.2. Images: PSF and noise Back to figure 6. The disk-shaped bright objects in the image are stars. Here’s the thing: Westerlund 2 is at a distance of about 20 000 light-years from us. At that distance, even an especially large super giant with 1500 solar radii should subtend an angle of a mere 0. 002 ′′. Each pixel in the WFPC2 image has a side length of

  1. 1 ′′. If our image were a faithful map showing the exact direction whence light reaches us from the sky, even these largest known stars would fall within a sin- gle pixel. Instead, they and the much more common markedly smaller stars are smeared out and appear in the image as disks. Figure 8 shows a brightness profile of the star at the bottom center of figure 6. The disk is a few pixel wide. Why the disk? Why not a single pixel? The answer, as you probably know, is that light is a wave phenomenon, and that a wave passing through an opening — in this case, the aperture of the telescope — is diffracted. The result is a diffraction pattern that makes the image of a point source a disk (if you were to look very closely, a disk with concentric rings around it). The larger the telescope aperture, the smaller the disk — which is one key reason to build ever larger tele- scopes: ever better resolution for the resulting images. The function that defines the brightness distribution that results when a telescope instrument produces an image

10 5 0 5 10 Pixel value

0

200

400

600

800

1000

Brightness value

Fig. 8.— Brightness profile of the bright star near the bottom center of figure 6

of a point source is called the point-spread function, ab- breviated PSF. Last but not least, look at the part of figure 6 that is not stars, but background. The background is not uni- formly black, but mottled grey — a section, shown at even larger magnification, can be seen in figure 9. There

Fig. 9.— Zooming in on part of the background of figure 6

are several reasons that some of the pixels are brighter, others less bright. One is the presence of distant, un- resolved astronomical objects. But those cannot explain the small-scale variation from pixel to pixel — remember that even a point source would appear as a smeared-out disk! Instead, the variability is noise — a spurious ad- dition that tells us nothing about the astronomical light sources out there. The most fundamental effect is one of statistics: light reaches our detectors in the form of photons, of light par- ticles. The intensity of light reaching us from a specific source determines the probability of a photon arriving within a certain time interval. But the arrival itself is a random event. (You probably know a similar situa- tion: radioactive decay, where the decay probability per unit time is constant, but each decay will still occur at an unpredictable random time.) This randomness trans- lates into pixel brightness fluctuations. The relative size of these fluctuations shrinks as the total number of pho- tons collected grows. This is the other key reason why astronomers want large telescopes (and in most cases still need long exposure times, in addition): the more light they can collect from a distant object, the smaller the

relative fluctuations, the higher the signal-to-noise ratio, and thus the clearer the image of those distant structures. There are other kinds of noise. If you inspect raw, unprocessed images taken with the Hubble Space Tele- scopes, you will find one kind that is typical for space telescopes: traces left by cosmic particles depositing their energy in the detector, leading to either longish streaks or more sharply defined dots, depending on the direction the particle was travelling. Figure 10 shows an exam-

Fig. 10.— Traces of cosmic ray particles in an image taken with the Hubble Space telescope. Image credit: NASA and ESA

ple (albeit from Hubble targeting a different object, the Eagle nebula). Also, there is noise from the electronic de- vices involved (although cooling key electronic elements down can reduce that kind of noise considerably).

2.3. Images: Noise and flatfielding When astronomers prepare data for the extraction of astronomical information, a process commonly called data reduction, there are several typical steps they take in order to reduce both the noise produced in their instrument and the instrument’s idiosyncrasies when it comes to recording brightness.^5 The image chips consist of little pixel elements; light falling onto a pixel sets free some electrons. In a CMOS, each pixel also contains the electronics, including a little amplifier, to read out a signal that indicates the number of electrons, and thus the amount of light. In a CCD camera, the read-out process is more involved, and in- volves herding electrons to the end of each pixel row, then moving them to an amplifier. In both cases, ideally, the number of electrons will be in direct proportion to the number of photons that have fallen onto that pixel dur- ing the exposure time. And in the end, those electrons are dumped onto a capacitor, whose voltage is measured. Since the voltage across a capacitor is proportional to the accumulated charge, the result gives us a measure of the number of electrons, and thus of the amount of light we have captured. The analog voltage value is fed into an analog-to-digital converter (ADC) which converts the voltage value into an integer digital number, correspond-

(^5) In preparing this section, I have profited from two on- line sources: The lecture notes by S. Littlefair for the course PHY217 – Observational Techniques for Astronomers he taught in 2014, http://slittlefair.staff.shef.ac.uk/teaching/phy217/, last access 2019-11-01, and parts of the e-book by R. A. Jansen, Astronomy with Charged [sic] Coupled Devices (2006), http://www.public.asu.edu/˜rjansen/ast598/2006ACCD.ebook...1J.pdf, last access 2019-11-01.

formation, we can correct for the sensitivity variations as follows. Assume that a pixel in the master flat is twice as bright as a second pixel. That would mean our setup is only half as sensitive for the second pixel than for the first. But if we were to take an image, and then to mul- tiply the brightness value for the second pixel with the factor two, we would have compensated for the different sensitivity levels. More generally, we can restore proper relative bright- nesses of all our pixels by dividing our science frame by our master flat, pixel by pixel. The result for our holi- day snap can be seen in Fig. 13. The compensation is

0 1000 2000 3000 4000 5000 6000

0 500 1000 1500 2000 2500 3000 3500 4000

Fig. 13.— Flatfield-corrected holiday snap

not perfect. For instance, if fewer photons have reached a certain detector pixel, then the statistical noise will be somewhat larger for that pixel. Dividing the pixel value by a factor, as one does in flatfielding, will not get rid of that additional noise. All in all, we have learned the basics of how as- tronomers are reducing their image data — compensat- ing for noise that is added to each pixel by subtracting suitable compensation terms (bias frame, dark frame, sky frame), and afterwards compensating for sensitivity vari- ations by dividing by a suitable compensation term (flat- field image). To sum up the last few sections: The digital astro- nomical images used by astronomers are made of pixels; what we see is in part determined by the properties of our target object, but in part by the properties of the optical system used (telescope plus instrument and their PSF), and in part by noise. The “elementary images” are black-and-white, and usually taken through a specific filter. When such images are displayed, additional deci- sions were involved about how to represent brightness. Published images frequently use colour to convey addi- tional information — although in most cases, these are false colour images, which do not reproduce the colour of the object we would perceive could we view it directly. Astronomers employ dark frames and flatfielding to re- duce certain types of noise, and of sensitivity variations. Sometimes, images are fit together to form a larger mo- saic. When working with images, we need to keep all this in mind — after all, we want to use the information con- tained in the image to make deductions about the astro- nomical objects observed. To do that, we need to know which aspects of the image really do contain informa-

tion about the object — and not information about the telescope-instrument combination, or photon statistics.

2.4. Images: astronomical information So far, we have talked mostly about image artefacts — what makes an image different from the real thing. Time to talk about the physics behind it all: What in- formation is contained in astronomical images? The in- formation important for classical astronomy, for a start: Images contain position information about astronom- ical objects, information about where exactly an object is located in the sky (for object whose position does not change in the usual coordinate system), or about how its position changes over time. In the era of classical astronomy, this was the main pur- pose of observatories: determining the positions of stars in the sky, as an aid to celestial navigation. This also included measurements that allowed for precise time- keeping: until the advent of stable quartz clocks in the mid-20th century, documenting the periodic changes in the night sky, in particular the diurnal motion during one (sidereal) day, was the most accurate time-keeping method. In modern astronomy, determining stellar positions re- mains an important sub-field, which for the last few years has been dominated by ESA’s astrometry satellite Gaia. Accurate catalogues of stellar positions not only provide a framework for localising astronomical objects in gen- eral. Via the parallax effect, they also provide informa- tion about the distances of stars in our cosmic neighbour- hood, which in turn is a prerequisite for farther-reaching methods of astronomical distance determination. Knowl- edge of astronomical distances is crucial for making de- ductions about object’s luminosity. (In principle, an ob- ject that appears to be bright in the night sky could have a rather faint luminosity, but appear bright since it is very close to us, or else have a really high luminosity while being rather more distant.) Then, there is photometry, that is, determining the (apparent) brightness of astronomical objects. If we have chosen proper exposure times, and made all necessary corrections, we could deduce the brightness of a star from the sum of the values of the pixels associated with that star. In practice, it’s difficult to separate star pixels from non-star pixels, but there is a simpler way known as aperture photometry: Define a circular region that contains the PSF of the star completely (the yellow cir- cle in Fig. 14). At some small distance outside, define an annular region (bounded by the two blue circles in Fig. 14). Assuming that the central circle contains only the star we are interested in, and the annular region no discernible star at all, we can argue as follows: If we sum up the pixel values within our central circle, we get the light from the star plus background light. We can esti- mate the background light as follows: On average, the background brightness should be the same in the central circle and in the surrounding annulus. In the annulus, we can determine the average brightness by summing up the pixel values in the annulus and then dividing by the num- ber of pixels (equivalently, by the area of the annulus). Multiply this average by the number of pixels in the cen- tral circle (equivalently, by the area of the circle), and the result will be the brightness contribution from the background within the central circle. Subtract this con-

1

Fig. 14.— Aperture photometry in a part of the Hubble Space Telescope image of Westerlund 2. Image credit: NASA, ESA, the Hubble Heritage Team (STScI/AURA), A. Nota (ESA/STScI), and the Westerlund 2 Science Team

tribution from the total sum of the pixel values within the central circle, and what is left is a measure of the brightness of the star. Note that photometric measure- ments are sometimes made with the telescope slightly out of focus, distributing the object’s light over a greater number of detector pixels for greater accuracy. In astronomical practice, stars are point-like objects. For extended objects, we can measure a surface bright- ness, given in brightness per angular area in the sky, and we can measure how that brightness varies from location to location. Such brightness maps contain information about the amount of material we are seeing. The situa- tion is more complicated when densities are so high that some of the matter obscures our view of whatever matter lies behind (that is, if the matter in question is “optically thick”). In the simplest case, we can see all of the light from the matter of, say, a nebula (the nebula is “opti- cally thin”), and the brightness in a certain area of the sky allows us to estimate how many atoms we are seeing in that area — a column density since we cannot de- duce the three-dimensional structure, only the number of atoms within that column of the three-dimensional ob- ject which gets projected to the sky-region in question. Brightness measurements will only ever cover some limited region of the electromagnetic spectrum. Some of the limitation comes about by the kind of telescope we use. An ordinary optical telescope will be able to receive visible light, near-infrared light, and ultraviolet light (which, for ground-based telescopes, is somewhat pointless since almost all UV light is filtered out in the Earth’s atmosphere). But its camera would not be able to detect, say, mid-infrared light, let alone X-rays. In practice, as we have already seen, astronomers voluntar- ily restrict themselves to even narrower portions of the spectrum, by using suitable filters. This allows for quan- titative description of the colors of astronomical objects. An object that is bright when viewed through a blue fil- ter, but dim when viewed through a red filter, will be blueish in color. What we have called the brightness so far, summing up pixel values in our image, is proportional to the number of photons from a certain source (or a certain area of the sky) entering our telescope during the exposure time. Since the exposure time is the same for all the objects

in our image, the ratio of brightness values for two such objects is equal to the ratio of the energy per unit time (in the given filter band) we receive from those two objects. Also, since in both case we are using the same tele- scope, and hence the same collecting area, the ratio is equal to the ratio of the energy per unit time per unit area (again in the given filter band) for those objects, or using the appropriate technical term: the ratio of their fluxes. Flux ratios are how astronomers traditionally compare the apparent brightness of celestial objects — except that, to ensure some degree of backwards-compatibility with the naked-eye-based, 2000-year-old Ancient Greek magnitude system (as one does), those ratios are mea- sured on a logarithmic scale. Specifically, if F 1 and F 2 are the flux values for light we receive from two objects 1 and 2 in a specific filter band, then their apparent magnitudes in that band are defined as

m 1 − m 2 = − 2. 5 · log

F 1

F 2

A reference point for the magnitude system is chosen by setting a value for the magnitude of a specific star in a specific filter band; for instance, in the V filter band that roughly corresponds to a green filter, the star Vega was originally chosen as a zero point, although his modern visual magnitude is mV = +0.03. Note the minus sign — magnitude values are larger for fainter stars. With the naked eye, under good conditions, you can observe stars with magnitudes up to about mV =

    1. For extended objects, we would need to document the energy per unit time and unit collecting area that reaches us from a given solid angle in the sky. There, the intensity, as energy received per unit time per unit area per unit solid angle is the appropriate descriptive quantity. Brightness, of course, can change. Different types of variable stars, for instance, can be distinguished by the shapes of their light curves, which document how their brightness changes over time. The transit method for detecting exoplanets also relies on light-curve measure- ments. Last but not least, images also contain spatial, 3D in- formation about the objects under study. Typically, that information is projected onto the sky — we do not see the full three-dimensional structure of, say, a gas cloud; instead, we have one particular fixed perspective on that cloud. Interpreting what we see typically involves mod- els for the physical, three-dimensional structure, whose predictions can then be compared to what we actually observe.

2.5. Spectra On to a central kind of data set in astrophysics: spec- tra! A spectrograph contains a dispersive element (or even more than one), which splits the incoming light into its rainbow colours or, in physics terminology, into its dif- ferent wavelengths. Examples for the three basic types of dispersive element an be seen in Fig. 15. Figure 16 shows ceiling lamps, imaged through a dispersive grid, namely through “spectral glasses” that can be used to demon- strate dispersion effects. As you can see in the figure, though, the spectral decomposition makes for a hodge-

much smaller wavelength interval, as in figure 20, you can see the the lines themselves have characteristic shapes.

515 516 517 518 519 520 Wavelength [nm]

relative Flux

Fig. 20.— Narrow region from a solar spectrum. Data from the IAG Solar Flux Atlas, Reiners et al. 2016

When working with raw data from spectra, certain re- duction steps need to be taken. Some of those are simi- lar to the reduction of image data: Flatfielding is again needed to compensate for differing sensitivity of the in- strument in different parts of the spectrum. This is more difficult for a spectrum than for an image, since for a true flat field, you would need a perfectly flat spectrum. Instead, any well-known, preferably smooth calibration spectrum can be used to deduce the varying sensitivity. Dark frames again can be used to take into account that the electronics of the detector will produce some spurious brightness in the image, which needs to be subtracted. Wavelength calibration is another necessity. After all, the spectral spread has a specific meaning — light is sep- arated according to wavelength (or frequency). In order to map specific wavelengths to the direction along which dispersion takes place, astronomers often employ specific calibration lamps, which contain a gas or a mixture of gases that produce a hopefully dense array of known emission lines. For a simple amateur spectrograph, you might use Neon for the purpose; at the professional level, you might for instance find a mixture of Thorium, Ar- gon, and Neon. Sometimes, the calibration lines will be recorded separately; in other cases, they are recorded concurrently with the astronomical observation to allow for a direct comparison. A special case of the latter, un- avoidable for ground-based telescopes, are telluric lines — absorption lines created not in outer space, but by light absorption in the Earth’s atmosphere. Such telluric lines can be used for calibration, as well. Image distortions can make spectral data reduction particularly challenging. A particularly complex case are Echelle spectrographs, where two kinds of spectral dispersion are combined: A grating will, in fact, pro- duce several different spectra, called “spectra of different order”. Higher orders tend to overlap each other, but the different overlapping partial spectra can be separated by a second stage of dispersion in the direction orthog- onal to the initial dispersion. The result are different rows of partial spectra, allowing astronomers to capture an entire high-resolution spectrum on a standard, two- dimensional camera chip. The raw image of one such Echelle spectrum, taken with the FEROS spectrograph at the MPG/ESO 2.2-metre telescope at ESO’s La Silla observatory, can be seen in Fig. 21. This is just a small

Fig. 21.— Region within a raw image of a spectrum of the star HIP66974 and a calibration lamp, taken with ESO’s FEROS spectrograph in June 2015 (Data set SAF+FEROS.2015- 06-13T23:16:46.772). Retrieved from ESO’s Science Archive on 24 April 2019

region, about 10%, of a much larger image. As you can see, the horizontal, curved stripes always come in pairs: the stripe on top is mostly white with some dark absorp- tion lines (which increase lower in the image), while the lower stripe consists of a fairly dense forest of emission lines. The upper stripe of each pair is the science image, in this case of the star HIP66974, a star with the same spectral type (G2V) as the Sun and thus a fairly similar spectrum. The lower stripe in each instance is the calibration lamp — hence the many emission lines, each marking a well-known reference wavelength. Reducing this spec- trum would mean to map the different stripes to their proper wavelength regions (using the calibration lines), unbending the curved image, and properly calibrating the brightness over the different parts of the image. Once the spectrum is reduced, or if one is working with a reduced spectrum in the first place, the spectrum as a whole and in particular the spectral lines contain a wealth of information about the object in question. Systematic Doppler shifts in the spectral lines indicate whether or not the light source is moving towards us or away from us. Doppler shifts that change periodi- cally over time contain information about objects orbit- ing each other, from double stars to exoplanets detected by the radial velocity method. Simple data analysis in these cases proceeds by fitting the individual spectral lines, finding their central wavelength, and tracing the changes of that wavelength over time. The shape and relative depth of spectral lines of a star contains information about the star’s metallicity, that is, the fraction of elements heavier than Helium contained in the star’s atmosphere, about the surface gravity and about the effective temperature. Some specific spectral lines corresponding to radioactive elements can be used to reconstruct the age of stars, and have been used to find the oldest stars in existence. The simplest part of such an analysis is about identify- ing the lines corresponding to specific chemical elements; these lines show which elements are present in the star’s atmosphere. Some such lines are indicated in figure 19. In all of these cases, analysis usually proceeds by creating spectra based on suitable models and comparing those with the actual observations, finding the best fit.

2.6. Data cubes So far, we have talked about two-dimensional images (where the two dimensions correspond to an area on the night sky) and one-dimensional spectra (where the one dimension corresponds to wavelengths). Data cubes are the combination of this: We have a two-dimensional im- age of the night sky, but at each pixel location, we have not only a single brightness value, but instead a whole spectrum. With two plus one dimensions, we are effectively look- ing not at a two-dimensional rectangle, but a three- dimensional cube. A data cube does not contain all the information reaching us from a certain region of the sky at a certain time (polarisation information is missing), but it comes impressively close. One way of obtaining such a data cube is with Inte- gral Field Spectroscopy — for instance: splitting an image into comparatively large “pixels,” each of which is channeled into a glas fibre which transmits its light to a spectrograph, where the spectrum is then recorded. Another natural way of recording such a data cube is in interferometry, where the light from several telescopes is combined in a coherent way, making use of the wave na- ture of light. In reconstructing images from interferomet- ric measurements, one can distinguish (to a certain de- gree) between contributions with different wavelengths; in effect, this allows for the reconstruction of a three- dimensional data cube. Human beings are not equipped for really three- dimensional vision. What we call three-dimensional vi- sion is really just seeing surfaces within (sparsely popu- lated) three-dimensional space. We cannot see all the points within a three-dimensional data cube at once. Fig. 22 shows one solution: showing separate images for

1.41630 GHz 1.41633 GHz 1.41635 GHz 1.41637 GHz

1.41640 GHz 1.41642 GHz 1.41645 GHz 1.41647 GHz

1.41650 GHz 1.41652 GHz 1.41655 GHz 1.41657 GHz

1.41659 GHz 1.41662 GHz 1.41664 GHz 1.41667 GHz

Fig. 22.— 16 of the 72 channels recorded for the galaxy NGC 3198, from the THINGS survey (Walter et al. 2008), [http://www.mpia.de/THINGS/]

different regions within the spectrum. In this particular

case, the “channel map” shows 16 of the 72 frequency bins around the 21 cm hydrogen line that is character- istic for atomic hydrogen (that is, hydrogen atoms; not bound into hydrogen molecules, not ionized to form a plasma, just simple atoms). One of the most interesting applications of data cube data is to extract the information they contain about the large-scale motion of matter. Fig. 22 shows 21 cm radiation emitted by hydrogen atoms, but some of that radiation is shifted to lower and some to higher frequen- cies — why? Because some of the atoms are moving to- wards us, others away from us, and their 21 cm radiation undergoes a corresponding Doppler shift. The data cube contains information on the radial ve- locity of the gas we see in the different frequency chan- nels. We can combine that information to make a color picture whose color encodes the average radial velocity of gas in each region of the image, giving what is called a first moment map. That picture is shown in Fig. 23. Reddish regions are moving away from us, blueish re-

200 300 400 500 600 700 800

200

300

400

500

600

700

800

Fig. 23.— Velocity map for the galaxy NGC 3198, from the THINGS survey (Walter et al. 2008), [http://www.mpia.de/THINGS/]

gions towards us. The combined picture is that of a disk galaxy whose stars are rotating as a whole, one side of the disk coming towards us and the other moving away from us in that coordinated motion. Alternatively, instead of taking the average, we could compute the standard deviation of the velocity values as- sociated with each pixel. That would give us an estimate not of the bulge motion of gas in that region, but of the diversity of motion, the spread of radial velocities. In a similar manner, we can use data cube informa- tion to map all those quantities that can be derived from spectra — the presence of specific elements and thus the chemical composition, on larger scales the prevalence of different kinds of stars, and more.

103 104 Temperature in K

10 1

101

103

105

Luminosity in

L

Fig. 26.— Hertzsprung-Russell diagram using physical quanti- ties (temperature and luminosity) instead of spectral classes and luminosity classes. Data from DEBCat

tween different populations of data points. For instance, let us colour the points in the upper-left cloud on the Hertzsprung-Russell diagram red, as Fig. 27. So far, the

103 104 Temperature in K

10 1

101

103

105

Luminosity in

L

Fig. 27.— Hertzsprung-Russell diagram using physical quanti- ties (temperature and luminosity) instead of spectral classes and luminosity classes. Data from DEBCat

colouring hasn’t brought us any great advantage. The cloud was apart from the rest before, and it is apart from the rest now. But let us carry this color scheme over to a histogram, for instance, plotting histograms for the red and the blue dots side by side, using the same bins. The result for a histogram of radii is shown in Fig. 28. Now

10 1 100 101 102 Radius in R

0

20

40

60

80

Fig. 28.— Separate histograms for the blue and red data points from Fig. HRDiagramRedBlue

we see that the stars corresponding to those red dots

have considerably larger radii than their main sequence counter parts. The size distributions are clearly sepa- rate. Those red-dot stars are veritable giants! We know from the Hertzsprung-Russell diagram that their temper- atures are somewhere between 4000 and 6000 K, going from reddish to yellowish. So these red giant stars were named with excellent reason. Let’s look at a diagram plotting, say, radius against density, as in Fig. 29. In that diagram, it is not clear

100 101 Radius in R

10 5

10 4

10 3

10 2

10 1

100

Mean density in

Fig. 29.— Plotting radius against density, both in solar units. Data from DEBCat

which are the main sequence stars and which are the red giants. With the red-blue distinction, the situation becomes clear, as shown in Fig. 30. Red giants are not

100 101 Radius in R

10 5

10 4

10 3

10 2

10 1

100

Mean density in

Fig. 30.— Plotting radius against density, both in solar units. Color marks red giants vs. main sequence stars. Data from DEB- Cat

only generally larger in radius than main sequence stars, they are also considerably less dense. Our data points us in the right direction: in the modern view, main sequence stars go through a red giant phase after they exhausted the hydrogen fusion fuel in their cores, their atmospheres swelling up and cooling down in the process, leading to a large, reddish star with drastically reduced mean density. We can use colour more quantitatively than just to ex- press class membership in a two-colour scheme. Colour can add an (imperfect) third dimension to our diagrams. In the version of a mass-luminosity diagram shown in Fig. 31, each data point has the proper star color cor- responding to it’s temperature (as determined from the star’s spectral properties^9 ). This color-coding immedi- (^9) Information about this kind of color mapping can be found on http://www.vendian.org/mncharity/dir3/starcolor/

ately allows you to identify the red giants, see that their mass range is a subset of the mass range of the main se- quence stars, but that the red giants are larger and more reddish. Color scales can also be artificial, and different

10 1 100 101 Mass in M

10 2

100

102

104

Luminosity in

L

Fig. 31.— Mass-luminosity diagram with data points plotted in the color corresponding to a star’s temperature. Data from DEBCat

color maps are available for the purpose. In Fig. 32, the color now indicates the radius of each star, with the scale shown by the colorbar on the right. Clearly, stellar radii

10 1 100 101 Mass in M

10 2

100

102

104

Luminosity in

L

100

101

Radius in

R

Fig. 32.— Mass-luminosity diagram with data points plotted using an artificial color map that indicates the stars’ radii. Data from DEBCat

grow along the main sequence, but the red giants, in their little cloud of data points above the main sequence, are larger still. A good color map can make your diagram much easier to understand; a bad one can be confusing. Also, you should take into account accessibility issues. Your color maps should be accessible even to people with certain forms of colour-blindness.^10 Unless you are plotting a spectrum, avoid the rainbow colour map. Instead, con- sider color maps that have been designed to be accessible for those with colour blindness, as well as to print well in black and white — for instance the Viridis family of colour maps. Patterns in such diagrams indicate interesting relation- ships. Is there a linear relation — do data points for cer- tain physical quantities fall on a straight line in a linear diagram? Or is there a power law at work, y ∼ xa, in which case the data points would fall on a straight line in a log-log diagram? In this way, diagrams can help us find systematic relations between our data.

(^10) Some information on this kind of accessibility can be found on [https://betterfigures.org/2015/06/23/picking-a-colour- scale-for-scientific-graphics/]

This is not as straightforward as it sounds, of course. In a two-dimensional diagram, we can plot at most three different quantities (if we make clever use of a color map). We could try all different pairs of physical quantities rel- evant for the situation we are looking at, and might get lucky in finding interesting relationships in that way. A three-dimensional diagram is possible, but would need to be interactive so we can view it from all different sides to get a feeling for the three-dimensional structures. Of course, the basic physical quantities can be combined to yield compound quantities. Complex relationships be- tween quantities, longer polynomials involving several quantities for instance, or differential/integral relations, are much less straightforward to read off such diagrams. The typical way of extracting information about sys- tematic relationships from data is to fit a function to the data. Assume that we have data points (xi, yi) for i = 1,... , N , each representing a pair of quantities. A common measure for how well those data points satisfy a general relationship y = f (x) is as follows. If the relationship were to hold perfectly, then we would have yi = f (xi). In real life, functional relationships are not that perfect. Even in cases where the relationship y = f (x) is the basis for our set of data points, mea- surement errors will lead to deviations. Moreover, ex- act relationships are rare; the much more common case is that the relationship is approximate, and that data points scatter around the curve y = f (x). For a single data point, the quantity

∆yi = yi − f (xi) (2)

is a measure of the deviation of the data point from the relationship. What is the best way of summarising the deviations for our data set as a whole? We can say what is definitely not a good measure: taking the sum of all the ∆yi, since deviations may be positive and negative, and the sizeable deviations associated with different data points could cancel each other out, skewing the result — we could even get an overall measure of zero, indicating no deviation, in a situation where the ∆yi are huge, but cancel pair-wise! To avoid this, we could take the sum of the absolute values |∆yi| but as we shall see later on, it is useful for the measure we choose to be differentiable. That why a better choice is the sum over the squares of the devia- tions,

S ≡

∑^ N

i=

[yi − f (xi)]^2. (3)

Commonly, we have an idea for the basic properties of the function f (x), but not about the explicit form of the function. For instance, we might have reason to believe the function to be linear (since that is what an x-y di- agram suggests), f (x) = ax + b, but do not know the values for a and b. In such a case, we can use the quantity (3) to find the best fit. Let us make explicit that the function depends on the parameters a, b, namely as f (x, a, b) = ax + b. For our set of data points and for any given pair of values a, b, we have

S(a, b) ≡

∑^ N

i=

[yi − f (xi, a, b)]^2. (4)

bit of which is shown in Fig. 34. You will probably need

Fig. 34.— Search results from the Hubble Legacy Archive

to scroll right to see the columns 13 and 14 we are in- terested in. In column 14, called “Dataset”, look for the data set names

  • hst 05773 05 wfpc2 f502n wf
  • hst 05773 05 wfpc2 f656n wf
  • hst 05773 05 wfpc2 f673n wf

At the time of this writing, it is fairly easy to find these images. If you don’t, you can try something different: In the search field on [https://hla.stsci.edu/hlaview.html], click on “Advanced search”. In the main field, enter “M16”, and in the proposal ID field, enter “05773”. The result will be a much shorter list, including the images listed above. Even in the far future, when the interface might have changed, searching for the proposal ID in addition to the object name should return a list that includes those images — even future archives should be “legacy-proof.” The images in question were all taken on April 1, 1995, and belong to one of the most iconic Hubble images: the pillars of creation. Each file should be around 53 MB in size. You can download the files by either clicking the little shopping cart icon for each image and then going to the shopping cart tab, or by right clicking on the shopping cart icon and choosing “save link as”. If you know astronomical abbreviations, you will be able to make some initial sense out of these dataset names: hst, for instance is bound to mean that we are downloading data from the Hubble Space Telescope. WFPC2 is the “Wide-Field and Planetary Camera 2” on that telescope, wf says that we are downloading the wide-field camera images. f502, f656 and f673 denote dif- ferent filters which have been placed before the camera for these respective images. We will combine the three images into a colour image — but it is going to be a false- color images, since those three filters do not correspond to red, green, and blue!

Last but not least, when you have downloaded the im- ages, you will notice that the filename extension indicates that these are FITS files, in the most common format used for scientific images in astronomy; the filename ex- tension is either “fits” or possibly if you are on an older Windows machine, “fit”.

3.2. A first look at the Eagle Nebula M Now that we have the files safely stored away, we can open them using DS9. To this end, go to the main menu row and click on “file” (which is probably highlighted to begin with); from the secondary menu that appears di- rectly below, choose “open”. In the usual pop-up choose- a-file window, I’ll choose the first of the Hubble files we downloaded, hst 05773 05 wfpc2 f502n wf drz.fits. Once the file is open, the DS9 window should look as in Fig. 35.

Fig. 35.— Opening an HST Eagle Nebula image in DS

The image is disappointingly black. We need to find a better brightness scale to see what is going on. Astro- nomical images typically capture an amazing dynamic range, that is, an amazing range of different brightness values for each pixel (concretely, 65536 different bright- ness values per pixel, compared with the 256 of a typical RGB pixel). Displaying such an image on a computer monitor, or printing it, can never do this range full jus- tice. Instead, we need to pick and choose — which part of the brightness range do we want to display, and which way of compressing the brightness scale shows us the most information about the image? There is no single right way of doing this, and there is no standard way that will guarantee the best results for all possible astronomical pictures. Instead, this is a matter of combining experience (your own and that of others!) with some experimentation to arrive at a result that works for you. You should always be aware that such a result is not a naked view of the astronomical data — what you see is determined both by the astronomical

data and by the choices you have made in displaying that data. (Also note that these display options do not change the image itself; the image file itself is unchanged, and you are only changing the way you are viewing the data.) To experiment a bit, go to the main menu and choose “scale”. From the secondary menu that will come up, choose “zscale”. Where previously (in “minmax” mode) the image had been displayed with the minimum pixel brightness set to black, and the maximum pixel bright- ness to white, the colors are now mapped to values closer to the median pixel brightness. As a result, your image should look something like this in Fig. 36. This still looks

Fig. 36.— HST Eagle nebula image in DS9 with zscale applied

fairly raw, and rather different from the pretty astron- omy pictures you see in the media. But at least we can discern some structures. The main window below the two rows of menus only shows part of the image. In the overview window (second to right, on top) you can see the whole of the image. The cyan frame in the overview window marks the part of the image that is visible in the main window. Drag it around (left-click the mouse and drag) to explore other parts of the image. Alternatively, you can go to “zoom” in the main menu and choose one of the options in the secondary menu to see the image as a whole (“zoom fit”), or in more detail (e.g. “zoom 2”). In the “scale” menu, you can also choose another com- pression method instead of linear — see how it affects the view! (Again, the image itself is not changed by your choice.) Also, instead of the grayscale display, you can go to the main menu point “color” and select another color map.

3.3. Coordinates: Navigating the image When your cursor is on the main image, what you will see will be something like in Fig. 37. In this image, the tip of the cursor is placed on a star. The detailed image (inset image top right) shows the star’s little disk, with DS9’s own representation of the cursor on top. Let us

Fig. 37.— HST Eagle nebula image with cursor on a star

take a closer look at the information on the top left. “File” is simply the file name. If the header identifies the object by name in a suitable way, that is what would be displayed in “Object.” The “Value” field gives you the value of that particular pixel. Below, we have the sky position information for the cursor, given in the “World Coordinate System” (WCS). “fk5” tells you that the sky coordinate system is de- fined using the reference stars of the Fifth Fundamen- tal Catalogue (Fundamentalkatalog 5), which was pub- lished in 1988 by Astronomisches Recheninstitut Heidel- berg (ARI, now a part of Heidelberg University). The coordinates themselves are those of the equatorial sys- tem, which is analogous to latitude and longitude on Earth: longitude corresponds to the right ascension α (sometimes abbreviated to RA, ra, or R.A.), latitude cor- responds to the declination δ (sometimes abbreviated to Dec, dec, or DE). Both RA and Dec are given here in the standard sex- agesimal notation. Right ascension is given in hours (in hour example: 18), minutes (18) and seconds (50.2864), written as 18:18:50.2864. (In other contexts, it might be written as 18h^18 m^50 s.2864, but the meaning is al- ways the same: Each object lies on a particular meridian (that is, on part of a great circle through the two poles of the coordinate system). The value of the right ascen- sion indicates where that meridian intersects the celestial equator, measured eastwards along the celestial equator, starting from the vernal point — the point where the Sun crosses the celestial equator northwards, at the time of the vernal equinox (at or around March 20 each year). For this measurement, 24 hours correspond to the full circle. The integer hour value is followed by the number of full minutes, with 60 minutes in a full hour, as ex- pected; the third position gives the seconds, again with 60 seconds in a minute. Declination is given in degrees northwards (positive