AST 5765/4762 (Advanced) Astronomical Data Analysis

AST 4762/5765 teaches basic data analysis as applied to observational astrophysics and planetary science, such as photometry and spectroscopy frames from CCDs and infrared arrays.

This is a very challenging class. If you are thinking of taking this class, please read this entire page!

It is not:

  • A programming course. Programming is a prerequisite; see below.
  • A course in “big data”. The class does teach automated analysis of observational astronomy data, and a few of the skills taught here do transfer well to big-data applications. However, these occupy only a portion of the class.

It is:

…an intense, hands-on, programming-based lab course. The class covers:

  • Probability and the theory of measurement uncertainties
  • Astronomical detectors and data formation
  • Photometry
  • Spectroscopy
  • Research paper writing and the publication process
  • A sampling of numerical methods

Most assignments are programs written by the student.  The last five homework assignments are a single data analysis coded in stages. The final project is a second data analysis, including a paper in the style of The Astrophysical Journal.

The students who do best in this class tend to be headed for graduate school and careers in planetary science or astronomy.

The rest of what students need in the class (data, routines, etc.) is in WebCourses.

AST 4762/5765 satisfies part of the Physics BS lab requirement, by petition.


Is this class for me?

If you will be doing research in astronomy, you have the prerequisites and the time, and you are a good programmer, this class is for you.  It has a longer-than-normal class time because we integrate the lab exercises, front-loading lecture at the start of the semester and emphasizing project work toward the end.  Please see the syllabi, handouts, and assignments, available at the link above, to get an idea of what’s involved.

WORKLOAD: AST 4762 is really an undergrad registration option for a grad class, with a somewhat-easier final project.  The workload is high, higher than Senior Design in Engineering (according to students who have taken both).  Other classes say you will need to spend 6-10 hours a week.  For this class, it’s really true, every week.  Ask students who have taken the class, or see below.  Also, you need to be a good computer programmer.

FORMAT: Undergrads and grads sit in the same lecture, but the homework has extra work for the grads, and the final project, worth 30% of the grade, is more challenging for them. Undergraduate students who have had some data analysis experience, who can program well, and who have the time may take the grad version.  It will look good on a grad school application and you will learn more.  But it is more work, especially the project.

The math needed for this class is differential equations for error analysis and calculus or algebra for the rest.  However, this is an intense class, because most of the work is programming assignments and it takes time to write good programs that run well.  We will discuss methods that make it quicker, but even so you can expect an average of 6-10 hours a week outside of class, and 15+ hours/week if you are struggling.  The final project is all that is due in finals week, and it is assigned about a month before it is due (there is no exam).

PREREQ: Ability to program a computer.

There will be a programming assessment at the beginning of the first class period.  Those who cannot do this in-class assignment well will be deregistered from the course as not meeting the programming requirement.

The single best predictor of course success is ability to program a computer WELL on the first day of the course, so programming ability is a formal prerequisite. Doing well (B or better) in PHZ 3150 Introduction to Numerical Computing should prepare you well, as it was specifically designed to prepare students for this course.  Other courses, including Computer Science 1, do not automatically satisfy the prerequisite, but they are excellent preparation if you keep programming afterward.

Most students who come in without programming ability have struggled in the course.  Several have failed, even after trying hard, and many have dropped.  We use the Python language, and spend the first 2 weeks getting up to speed in Python.  That’s reasonable if you know how to program in another language, but it’s very fast if you’ve never programmed.  Having taken computer classes, especially in the distant past, is not enough.  You actually need to have put that knowledge to use and be comfortable with programming to solve real problems.  Conversely, if you can program, you don’t need to have taken any class.  If you cannot write a program (right now, no looking at notes or other sources of help) to read a file of numbers, calculate the mean and find the minimum and maximum, and print the results, do not take this course!
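As a rough illustration of the level expected, the self-test described above might look something like the following in Python. This is only a sketch, not the in-class assessment: the function names `summarize` and `report`, the file name `numbers.txt`, and the input format (numbers separated by spaces or newlines) are all assumptions made for the example.

```python
def summarize(path):
    """Return (mean, minimum, maximum) of the numbers in a text file.

    Assumes the file holds numbers separated by whitespace,
    with any number of values per line (an assumed format).
    """
    values = []
    with open(path) as infile:
        for line in infile:
            # split() with no arguments skips blank lines and extra spaces
            for token in line.split():
                values.append(float(token))
    if not values:
        raise ValueError("no numbers found in " + path)
    mean = sum(values) / len(values)
    return mean, min(values), max(values)


def report(path):
    """Print the mean, minimum, and maximum of the numbers in a file."""
    mean, lowest, highest = summarize(path)
    print("mean:", mean)
    print("min: ", lowest)
    print("max: ", highest)
```

To try it, create a small file of numbers (for example, a hypothetical `numbers.txt`) and call `report("numbers.txt")`. If writing something like this from memory feels comfortable, you likely have the basic skills the course assumes.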

If you wish to learn on your own, the class uses Python. Python is one of the easiest languages to learn, and there are several good books on it.  Here is the handout we use in the first 2 weeks of the course. Working through even the easiest of the books on the first page should prepare you well.  If you get into the fun stuff on the second page of the handout, you’re working ahead, which is great.  All that’s needed is basic skills: using variables well, loops and conditionals, functions, and getting your program to run in an operating system.


Linux Working Environment on Your Computer

You must bring your own computer to class. It must be able to run VirtualBox or another virtualization environment, and must have a pointer (mouse), keyboard, and enough RAM and free disk space to run VirtualBox (4 GB of RAM in addition to what the native OS needs, and 20 GB of free disk space). A large screen is recommended, as we will be using many windows simultaneously. We will help you set up a virtual Linux environment with Python and a few other tools, which you must use for the class. Mac users can install these tools natively, without the virtual environment, if they wish. All software used in the class is free (as in speech and as in beer). You are responsible for managing your own computer during the class.

Prereq: A 3000-level AST Course

Since the class does not teach any actual astronomy, you must have taken a course in astronomy that involves mathematics, or have done significant other work in astronomy, so that you understand the nature of the data we will use and why we are analyzing it.  Besides a class, there are many other ways to demonstrate this knowledge, including:

  • Be doing or have done research in astronomy or planetary science, here or elsewhere, for at least a semester. Concurrent research counts if you are registered for credit.
  • Be an active member of the Astronomy Society who helps with open houses and is generally conversant with astronomy topics, the sky, etc.
  • Be a serious amateur astronomer or citizen scientist in astronomy.
  • Join a self-study group that works through an astronomy textbook or a series of literature readings that involve mathematics.

What Students Say

“To be fair, all professors say to expect to spend 6-10 hours on their material outside of class. This was the first time it was actually true.” – CT, senior majoring in Mechanical Engineering, 2014

If you have any questions, please ask!