# -*- coding: utf-8 -*-
# ---
# jupyter:
# jupytext:
# custom_cell_magics: kql
# text_representation:
# extension: .py
# format_name: percent
# format_version: '1.3'
# jupytext_version: 1.11.2
# kernelspec:
# display_name: pic312
# language: python
# name: python3
# ---
# %% [markdown] id="un2r-8IJ4Fos"
# # Basics of analyzing transmission spectra with JWST
#
# You should have already installed PICASO (which can be found [here](https://natashabatalha.github.io/picaso/installation.html)). In this tutorial you will learn how to model the transmission spectrum of Wasp-39b. This entails learning the basics of:
#
# 1. Transit transmission spectra modeling
# 2. Data-to-model comparison via chi-square statistic
# 3. Climate modeling
# 4. Cloud modeling
#
# **NOTE: This tutorial is aimed at both beginner and advanced levels.** It is comprehensive and includes various locations to check understanding.
# %% [markdown] id="6b13e969-791b-4a38-885a-82732ba5627d"
# # Check PICASO Imports
# %% [markdown] id="0a481e95-1809-43e4-8485-6ce5652ddc58"
# Here are the two main PICASO functions you will be exploring:
#
# `justdoit` contains all the spectroscopic modeling functionality you will need in these exercises.
#
# `justplotit` contains all the of the plotting functionality you will need in these exercises.
#
# Tips if you are not familiar with Python or `jupyter notebooks`:
#
# - Run a cell by clicking shift-enter. You can always go back and edit cells. But, make sure to rerun them if you edit it. You can check the order in which you have run your cells by looking at the bracket numbers (e.g. [1]) next to each cell.
#
# - In any cell you can write `help(INSERT_FUNCTION)` and it will give you documentation on the input/output
#
# - If you type `jdi.` followed by "tab" a box will pop up with all the available functions in `jdi`. This applies to any python function (e.g. `numpy`, `pandas`)
#
# - If you type class.function?, for example `jdi.mean_regrid`, it will describe the function and its parameters/returns. This also applies to any class with a function.
# %% [markdown] id="b9ba3b9f"
# ## Make sure we have the right data
#
# 1. Are you a student and want to quickly run this without going through full PICASO data install setup? **PROCEED TO A do not edit B**
#
# 2. Have you already installed picaso, set reference variables, and have an understanding of how to get new data products associated with PICASO? **PROCEED TO edit B.**
# %%
import picaso.data as d
picaso_refdata = '/data/test/tutorial/picaso-lite-reference' #change to where you want this to live
"""
A) Uncomment if you are need the picaso-lite data
"""
d.os.environ['picaso_refdata'] = picaso_refdata
#if you do not yet have the picaso reference data complete this step below
#d.get_data(category_download='picaso-lite', target_download='tutorial_sagan23',final_destination_dir=picaso_refdata)
"""
B) Edit accordingly if needed (no need to edit if completed "A" above)
"""
#picaso ref data, if need to set it
d.os.environ['picaso_refdata'] = picaso_refdata
#stellar data environment
d.os.environ['PYSYN_CDBS'] = d.os.path.join(d.os.environ['picaso_refdata'],'stellar_grids')
#path to virga files
mieff_dir = d.os.path.join(d.os.environ['picaso_refdata'],'virga')
#path to correlated k preweighted files
ck_dir = d.os.path.join(d.os.environ['picaso_refdata'],'opacities', 'preweighted')
#"""
#lets check your environment
d.check_environ()
#dont have the data? return to the step A above to use the get_data function
# %% id="P_mk7nKI78GM"
import os
# Check you have picaso
from picaso import justdoit as jdi
from picaso import justplotit as jpi
import picaso.opacity_factory as op
import numpy as np
jpi.output_notebook()
# %% [markdown] id="9698f8d6-21c1-4230-a9a9-253232319d49"
# # Observed Spectrum
#
# Before we start modeling a planet to match WASP-39 b, let's get hold of WASP-39 b's actual observed spectrum in data format, so we can plot it next to our modeled ones. If you did [part 1 of this tutorial](https://github.com/Kappibw/JWST/blob/main/2_retrieving_jwst_spectra.ipynb), you already downloaded the data prepared by the scientists who wrote [the CO2 discovery paper](https://arxiv.org/pdf/2208.11692.pdf).
#
# Let's practice download the data from [Zenodo which is a popular place to host reduced data](https://zenodo.org/record/6959427#.Yx936-zMJqv). Download the .zip, and then look for `ZENODO/TRANSMISSION_SPECTRA_DATA/EUREKA_REDUCTION.txt`.
#
# %% id="ed9d8643-e8b9-43a3-a933-d347b6110a65"
eureka_reduction_path = '/data2/observations/WASP-39b/ZENODO/TRANSMISSION_SPECTRA_DATA/EUREKA_REDUCTION.txt'
# %% id="qdLa7P4lqUJD"
# Import ascii
from astropy.io import ascii
# Confirm can read the file
observed_data = ascii.read(eureka_reduction_path)
# %% id="353ce54e-3a05-4f42-b570-ef1769a51a89"
observed_data.colnames
# %% [markdown]
# Let's plot the observed data and see what we are working with!
# %% id="c12047e4-1d2a-4132-8c65-644d463d9ba3"
plt=jpi.plt
plt.figure(figsize=(12,4))
plt.errorbar(observed_data['wavelength'], observed_data['tr_depth'],
[observed_data['tr_depth_errneg'], observed_data['tr_depth_errpos']],fmt='o')
plt.title('WASP-39 b Observed Spectrum')
plt.yscale('log')
plt.ylabel('Transit Depth')
plt.xlabel('Wavelength (micrometers)')
plt.show()
# %% [markdown] id="6d21b8cc-f53a-4bd4-85fc-6db71f9309af"
# # Spectra Ingredients
#
# Now what? Let's slowly build up a model that can match this data. What do we need?
#
# ## PICASO Basics
#
#
# ### List of what you will need before getting started
#
# 1. Planet properties
#
# - planet radius
# - planet mass
# - planet's equilibrium temperature
#
# 2. Stellar properties
#
# - stellar log gravity
# - stellar effective temperature
# - stellar metallicity
#
# 3. Opacities (how do molecules and atoms absorb light under different pressure/temperature conditions)
#
#
# %% [markdown] id="1e0d9c4c"
# # Basic Inputs
#
# ## Cross Section Connection
#
# All rapid radiative transfer codes rely on a database of pre-computed cross sections. Cross sections are computed by using line lists in combination with critical molecular pressure broadening parameters. Both can either be derived from theoretical first principles (e.g., [UCL ExoMol's line lists](https://www.exomol.com/)), measured in a lab, and/or some combination thereof (e.g., [HITRAN/HITEMP line lists](https://hitran.org/)).
#
# When cross sections are initially computed, a resolution ($\lambda/\Delta \lambda$) is assumed. Cross sections are computed on a line-by-line nature and therefore usually computed for R~1e6. For JWST we are often interested in large bands (e.g. 1-14 $\mu$m). Therefore we need creative ways to speed up these runs. You will usually find one of two methods: correlated-k tables and resampled cross sections. [Garland et al. 2019](https://arxiv.org/pdf/1903.03997.pdf) is a good resource on the differences between these two.
#
# For this demonstration we will use the resampled cross section method. **The major thing to note about using resampled cross sections** is that you have to compute your model at ~100x higher resolution than your data, but then must bin it down to a comparable resolution as your data so you can compare them. You will note that the opacity file you downloaded is resampled at R=10,000. Therefore you will note that **in this tutorial we will always bin the model down to R=100** before comparing with the data.
#
# **The overall idea is that we are just simply initializing PICASO to operate, create a "connection" to the models/spectral tools, in that wavelength (in microns) with the respective opacity range.**
# %% id="1c4f1eb8-1721-4617-bec3-073a665fab3a"
opa = jdi.opannection(wave_range=[2.7,6])
# %% [markdown] id="a13f6558"
# ## Set Basic Planet and Stellar Inputs
#
# Second step is to define the basic planet parameters. Depending on the kind of model you want to compute (transmission vs. emission vs. reflected light), there are different requirements for the minimum set of information you need to include.
#
# For WASP-39 b, since we have planet mass, radius, and all the necessary stellar specifications, we will be thorough in our inputs and include all parameters.
# %% [markdown]
# We can create an object called w39 that represents the planet WASP-39 b. We then assign values to that object. We already know the temperature, radius, metallicity, and surface gravity of the host star, so we can assign those values to the parent star under w39.star. We also know the mass and radius of the planet, so we can assign those values to the planet under w39.gravity (named thus because mass and radius will be used to compute the planet's surface gravity). As we move through the tutorial and make climate models etc., we will be doing these calculations on the object w39.
# %% id="ce1ce8d8"
# Create that object or "case"
w39 = jdi.inputs()
# Describe the star
w39.star(opa, temp=5400 , database='phoenix',
metal=0.01, logg=4.45, radius=0.9, radius_unit=jdi.u.R_sun )
# Describe the planet
w39.gravity(mass=0.28, mass_unit=jdi.u.M_jup,
radius=1.27, radius_unit=jdi.u.R_jup)
# %% id="8eaf1915"
# To get information about a function, you can use '?'
help(w39.star)
# %%
# want to see the spectrum you input? it lives here
w39.inputs['star']
# %% [markdown] id="cbd735be"
# ## Set Climate and Chemistry
#
# We now need to think about how we can model the climate and chemistry of this system. For the sake of this tutorial we will start really simple, then move forward to something more complex.
#
# **For climate**, we will explore these levels in this tutorial
#
# 1. Isothermal (now)
# 2. Radiative-convective (next section)
#
# **For chemistry**, we will explore these levels in this tutorial
#
# 1. Chemical Equilibrium
#
# %% [markdown] id="ec4bd5a4-9dc7-498a-8db9-e7e3c392509b"
# ### Pressure
# If we imagine our "nlevels" as equally spaced altitude bands on the planet, then we will assign pressures to decrease logarithmically as altitude increases.
#
# Gas is compressible and tends to behave in that way in planetary atmospheres (including on Earth).
# %% id="774e33dc-80fd-4a99-9a2f-c8b611c301b9"
nlevels = 50
# Logspace goes from base^(start) to base^(end)
# so here we are going from 10^-6 to 10^2, which is
# 1 microbar to 100 bars of pressure.
# This is an arbitrarily chosen range, but this is the most common.
pressure = np.logspace(-6,2,nlevels)
# %%
print(pressure)
# %% [markdown] id="a1df5148-9cc7-49ff-b19e-3ffeb57d5750"
# ### Isothermal Temperature
# %% [markdown]
# Next we need to decide how the temperature of the atmosphere varies with pressure. On Earth, temperature generally drops as you travel further from the Earth's surface, i.e. higher in altitude and lower in pressure. But that is not always the case. For simplicity, we'll start by assuming the temperature of WASP-39 b's atmosphere is constant with pressure, which is called "isothermal." As our models get more sophisticated, we'll be able to refer back to this simple case.
# %% id="e57e4846-fe43-4731-8cf6-978672da70d2"
# We can see from exo.MAST that the equilibrium temp
# of WASP 39 b is 1120 kelvin, so let's use a scale
# of temperatures based on that.
equilibrium_temperature = 1120.55
isothermal_temperature = np.zeros(nlevels) + equilibrium_temperature
# %%
print(isothermal_temperature)
# %% [markdown] id="05e13ad1-1728-4dfb-9638-05b761608172"
# #### Setting the Atmosphere in PICASO
# %% [markdown]
# So far we have described the parent star and the planet. Now let's define the planet's atmosphere. Note, this is a common workflow, where you start by creating an object (w39) and then slowly add parameters (information) as you go.
# %% id="475d82ed-51c6-45c3-8f12-3d8472ee3ea2"
w39.atmosphere(df = jdi.pd.DataFrame({
'pressure':pressure,
'temperature':isothermal_temperature}))
# %%
# want to see the input added to the class
w39.inputs['atmosphere']['profile']
# %% [markdown] id="f5016fc9-1275-4a55-b9ec-adccceae4921"
# ### Chemistry
#
# Now we need to add the chemistry! PICASO has a prebuilt chemistry table that was computed by Channon Visscher. You can use it by adding it to your input case. Two more chemistry parameters are now going to be introduced:
#
# 1. M/H: Atmospheric metallicity
# 2. C/O ratio: Elemental carbon to oxygen ratio
#
# %% [markdown] id="bb981137-ccbc-4f77-9cad-e7fb35d397ea"
# #### Metallicity
#
#
#
# Looking at a mass-metallicity plot (compiled in [Wakeford & Dalba 2020](https://ui.adsabs.harvard.edu/abs/2020RSPTA.37800054W/abstract)) might offer a good starting point to decide what the M/H of your planet might be. Here we can see WASP-39 b HST observations led to the inference of ~100xM/H. One tactic might be to start from that estimate. Another might be to use the Solar System extrapolated value (gray dashed line) as a first pass. Let's start with the latter as a first guess.
# %% id="9bc13061-939e-4cf5-b7fb-081e35d60100"
log_mh = 1.0 # log relative to solar
# so a value of 1 here represents 10^1 = 10x solar
# %% [markdown] id="12494d50-6908-4ee4-85c4-5ce31fd580ec"
# #### C/O Ratio
#
# The elemental ratio of carbon to oxygen controls the dominant carbon-bearing species. For instance, take a look at Figure 1 from the paper [C/O RATIO AS A DIMENSION FOR CHARACTERIZING EXOPLANETARY ATMOSPHERES](https://iopscience.iop.org/article/10.1088/0004-637X/758/1/36/pdf).
#
# At low C/O, we see CO$_2$ and CO as the dominant form of carbon, and at high C/O we see CH$_4$ and CO as the dominant form of carbon.
#
# C/O is given in PICASO in units relative to solar C/O, which is worth noting because you're not giving it the actual ratio of carbon to oxygen, but rather the ratio relative to the Solar C/O. Solar C/O is ~0.5 ([Asplund et al. 2021](https://ui.adsabs.harvard.edu/abs/2021A%26A...653A.141A/abstract)), and let's set our value to be the same as Solar (i.e. 1 relative to Solar).
# %% id="27069c4b-0a92-45f6-a21b-ed56682aef57"
c_o = 0.55 # absolute solar
# %% [markdown] id="afc5200e-7afe-4366-bc8c-cb132951885c"
# Now we can ask PICASO to make us a mixture of molecules consistent with the relative M/H metallicity and relative C/O ratio we set up, and we can take a look at what it creates. We can look at the "atmosphere profile" which shows us the abundances of various molecules at different levels of our atmosphere (levels being places where temperature and pressure differ in the manner we defined above).
#
# This can help us find out which molecules we should really focus on during our analysis, and others that may be harder or redundant to look for based off of our C/O ratio and metallicity.
# %% id="c618536e-3990-48b0-bb7e-516d1ab83223"
w39.chemeq_visscher_2121(c_o, log_mh)
# %% [markdown] id="80b1b45d-eb0a-4373-8f83-165807d47e92"
# Running this function sets a dataframe in your inputs dictionary, which you can access now with `w39.inputs['atmosphere']['profile']`. You can see that PICASO has given us loads of different molecules to work with, but many have miniscule abundances (note some e-38 values in there).
# %%
w39.inputs['atmosphere']['profile'].head()
# %% [markdown] id="98aeb8fc-4b53-4457-870a-a899affc9a9c"
# ### Reference Pressure
#
# Lastly, we need to decide on a "reference pressure." If our planet was terrestrial, this would be the pressure at the surface, and therefore also the pressure corresponding to the radius of the planet. For gas giants like WASP-39 b, this is a bit more complicated -- there is no "surface," so we need to pick a pressure that corresponds to our planet's "radius" or surface, so that PICASO can calculate gravity as a function of altitude from that level.
#
# We are selecting a pressure that we are essentially calling the bottom.
#
# As we've discussed, a planet's radius changes depending on the wavelength at which you observe it -- it's that change that we are measuring with our spectra. But when we input the planet radius above, we picked a single number -- 1.27 `r_jup`. That number is an average calculated over a band of wavelengths. And so when we pick a reference pressure, we estimate roughly what level of the planet's atmosphere that averaged radius corresponds to (where, from the high pressure deep inside, to the low pressure at the exterior, does the chosen radius fall?).
#
# PICASO suggests a reference pressure of 10 bar for gas giants, so we will start with that:
# %% id="7a1b1cc7-37ef-4960-8512-f4d77f326fd4"
w39.approx(p_reference=10)
# %% [markdown] id="969c933b"
# ### Want to check your inputs so far?
#
# If you want you can consult `w39.inputs` to check or reset inputs. Let's see how our WASP-39 b object is holding up!
# %% id="3dc8c591"
w39.inputs.keys()
# %% [markdown]
# An atmosphere profile table shows the elemental abundances at each pressure level in our model atmosphere.
# %% id="7be17b68"
w39.inputs['atmosphere']['profile'].head()
# %% [markdown]
# We can look at the abundances for a single molecule, say CO$_2$.
# %% id="d62cbd84"
# Grab CO2 array, for instance
w39.inputs['atmosphere']['profile']['CO2'].values
# %% [markdown]
# We can check our inputs for the planet and the host star.
# %% id="61388471"
w39.inputs['planet'], w39.inputs['star'] # All your inputs have been archived!
# %% [markdown] id="4ed9002e-85a3-4b3b-a20f-2750e7004847"
# # Creating a Transmission Spectrum
#
# Now that we have set up PICASO with everything it needs, and we understand the components needed to model an exoplanet, let's ask PICASO to output a transmission spectrum for our WASP-39 b.
#
# ## First Run PICASO
#
# We can use the .spectrum function to do so.
#
# We need to give the function a connection to the opacity database we used earlier to look up the absorption spectrum for water, as well as an instruction to create a "transmission" spectrum (as opposed to, for example, a reflected light spectrum of star light bouncing off the planet when it's almost behind its star).
# %% id="bcf63c09-64b4-4122-b56e-24445bb489ce"
model_iso = w39.spectrum(opa,
# Other options are "thermal" or "reflected"
# or a combination of two e.g. "transmission+thermal"
calculation='transmission',
full_output=True)
# %% [markdown] id="ba6bba7f-3eb9-4a8a-8c5c-e88c7c371f7d"
# ## Our Spectra
#
# Let's set up a function to display the spectrum in our first output dictionary (first one is called `model_iso`). Moving forward we will create more models and we want a way to easily display them.
# %% id="42dd80a1-e613-4038-a71a-f16b76c9cf11"
def show_spectra(output, x_range_min=3.0, x_range_max=5.5):
# Step 1) Regrid model to be on the same x axis as the data
wnos, transit_depth = jdi.mean_regrid(output['wavenumber'],
output['transit_depth'],
newx=sorted(1e4/observed_data['wavelength']))
# Step 2) Use PICASO function to create a plot with the models
fig = jpi.spectrum(wnos, transit_depth,
plot_width=800,y_axis_label='Absolute (Rp/Rs)^2',
x_range=(x_range_min,x_range_max))
# Step 3) Use PICASO function to add the data so we can compare
jpi.plot_multierror(observed_data['wavelength'], observed_data['tr_depth'], fig,
dy_low=observed_data['tr_depth_errneg'], dy_up=observed_data['tr_depth_errpos'],
dx_low=[0],dx_up=[0], point_kwargs={'color':'black'})
jpi.show(fig)
# %% id="fe9cb93d-69af-4436-ac40-77c1b5e358df"
# Display our first spectrum!
show_spectra(model_iso)
# %% [markdown]
# Sweet! We have a spectrum that looks the right-ish shape, even though it isn't quite in the right place. Let's take a minute to work that out.
# %% [markdown] id="bd05513b-9193-4e3c-b39e-74a8c877d446"
# ### Transit Depth Offsets
#
# Remember how we guessed what the reference pressure was? Well, it looks like we are a little off. That is okay! When fitting for transit spectra, we introduce a factor to account for this. In the retrieval tutorial, you will fit for a factor of the radius. For now, let's "mean subtract" our data so that the model and data lie on top of one another.
# %% id="ea7f587f"
# Let's mean subtract these
def show_spectra(output, x_range_min=3.0, x_range_max=5.5):
# Step 1) Regrid model to be on the same x axis as the data
wnos, transit_depth = jdi.mean_regrid(output['wavenumber'],
output['transit_depth'],
newx=sorted(1e4/observed_data['wavelength']))
# Step 2) Use PICASO function to create a plot with the models
fig = jpi.spectrum(wnos, transit_depth-np.mean(transit_depth),
plot_width=800, y_axis_label='Relative (Rp/Rs)^2',
x_range=(x_range_min,x_range_max))
# Step 3) Use PICASO function to add the data so we can compare
jpi.plot_multierror(observed_data['wavelength'],
observed_data['tr_depth'] - np.mean(observed_data['tr_depth']),
fig,
dy_low=observed_data['tr_depth_errneg'], dy_up=observed_data['tr_depth_errpos'],
dx_low=[0],dx_up=[0], point_kwargs={'color':'black'})
jpi.show(fig)
# %% id="d8525969-4251-4c5b-8bf9-17a71c3e8116"
show_spectra(model_iso)
# %% [markdown] id="56c298ba"
# ## Model Investigation
#
# Before trying to improve the complexity of your model, let's make sure you know how to analyze the inputs.
#
# ### Identifying molecular features with optical depth contribution plot
#
# `taus_per_layer` - Each dictionary entry is a nlayer x nwave that represents the per layer optical depth for that molecule.
#
# `cumsum_taus` - Each dictionary entry is a nlevel x nwave that represents the cumulative summed opacity for that molecule.
#
# `tau_p_surface` - Each dictionary entry is a nwave array that represents the pressure level where the cumulative opacity reaches the value specified by the user through at_tau.
#
#
# %% [markdown]
# Let's create a molecule contribution plot. At any given wavelength, this will show us at which pressure layer (`tau_p_surface`) each molecule experiences an optical depth of $\tau$. Below, we will set $\tau = 1$. This is useful because whenever we see a spectral feature, that light is originating from a depth in the atmosphere corresponding to the $\tau=1$ surface for that molecule.
# %% id="0c06856a"
molecule_contribution = jdi.get_contribution(w39, opa, at_tau=1)
# %% id="4639440f"
jpi.show(jpi.molecule_contribution(molecule_contribution,
opa, plot_width=700, x_axis_type='log'))
# %% [markdown] id="8d9d34fc"
# ### Identifying molecular features with "leave-one-out" method
#
# Another option for investigating model output is to remove the contribution of one gas from the model to see if it affects our spectrum. CO$_2$ was a fairly obvious feature for the 4.3$\mu$m. But what about H$_2$O and CO from 4.4-6$\mu$m. In this region there is no distinct "feature" in the spectrum. How sure are we that H$_2$O and CO are really there? We can use the "leave-one-out" method to see how individual molecules are shaping each part of our spectrum.
#
# We can use PICASO's `exclude_mol` key to exclude the optical contribution from one molecule at a time. In the code below lets do this for CO2, H2O, and CO. We will loop through one by one and create a spectra without each of these. The result will show you where their contribution to the spectra is most important.
# %% id="a457832c"
w,f,l =[],[],[]
df_og = jdi.copy.deepcopy(w39.inputs['atmosphere']['profile'])
for iex in ['CO2','H2O','CO',None]:
w39.atmosphere(df=df_og,exclude_mol=iex, sep=r'\s+')
df= w39.spectrum(opa, full_output=True,calculation='transmission') #note the new last key
wno, rprs2 = df['wavenumber'] , df['transit_depth']
wno, rprs2 = jdi.mean_regrid(wno, rprs2, R=150)
w +=[wno]
f +=[rprs2]
if iex==None:
leg='all'
else:
leg = f'No {iex}'
l+=[leg]
jpi.show(jpi.spectrum(w,f,legend=l))
# %% [markdown]
# We can see clearly that when H$_2$O or CO$_2$ is not present, a completely different spectrum is created that veers far from our model. Therefore we can feel confident that H$_2$O or CO$_2$ are truly present. When we try leaving out CO, however, the spectrum changes more modestly; it appears that including CO improves the spectrum somewhat, but we are less certain about its presence.
# %% [markdown] id="b1aab798-3d33-4582-9952-94e45e686474"
# # Increasing Model Complexity to Improve Fit: Chemistry, Climate, Clouds
#
# In the next sections we will try to improve our model fit. However, before we continue we need a way to quantify our "goodness of fit."
# %% [markdown] id="8dbbe171-a38f-4bee-8b59-93e3108d6ca4"
# ## Define Goodness of Fit
#
# Let's implement a simple measurement of error called a "chi-squared test." This is a commonly used method to measure how well you are fitting data with a model, and sums up the distance between your model's output and the observed data at each data point.
#
# The standard chi-squared test formula is $\chi^2 = \sum \frac{(O_i - E_i)^2}{(\sigma_i)^2}$ where $O$ is the observed data, $E$ is the expected value (i.e. our model), and $\sigma$ is error.
#
# In this notebook, we'll compute chi-squared *per data point*, which means we are normalizing the standard chi-square by dividing by number of data points. The chi-squared per data point formula is $\frac{\chi^2}{N}$ where $N$ is the number of data points.
#
# Note, another common way to report chi-squared is the *reduced* chi-squared which considers the degrees of freedom $(\nu)$, where $\nu$ is the number of data points ($N$) minus the number of fitted parameters.
#
# Regardless of the version of chi-squared you use, if the result is close to 1, that is considered a good fit.
# %% id="9054e287-dcbb-45c6-a21c-db2fcd9828ce"
def chisqr(model):
# Step 1) Regrid model to be on the same x axis as the data
wnos, model_binned = jdi.mean_regrid(model['wavenumber'],
model['transit_depth'],
newx=sorted(1e4/observed_data['wavelength']))
# Step 2) Flip model so that it is increasing with wavelength, not wavenumber
model_binned = model_binned[::-1]-np.mean(model_binned)
# Step 3) Compute chi sq with mean subtraction
observed = observed_data['tr_depth'] - np.mean(observed_data['tr_depth'])
error = (observed_data['tr_depth_errneg'] + observed_data['tr_depth_errpos'])/2
return np.sum(((observed-model_binned)/error)**2 ) / len(model_binned)
# %% id="76d4f75c"
print('Simple First Guess', chisqr(model_iso))
# %% [markdown] id="72a5a724"
# Not great! Let's make it better.
# %% [markdown] id="1fb463d1"
# ## Revisiting Chemistry Assumption
#
# Earlier, we assumed M/H and C/O values. Let's loop through a few M/H and C/O values to see if any of these seem to improve our model fit. Why? These values are faster to assess than climate or clouds, and they're easier to loop through while you're building intuition. We want to find the best fit combination of variables such as C/O and M/H that will gives us the closest fit to the data.
# %% id="ae270058"
mh_grid_vals = [1, 10, 100]
co_grid_vals = [0.55, 1.3] #absolute c/o ratios
chemistry_grid = {}
for imh in mh_grid_vals:
for ico in co_grid_vals:
w39.chemeq_visscher_2121(ico, np.log10(imh))
chemistry_grid[f'M/H={imh},C/O={ico}'] = w39.spectrum(opa, calculation='transmission', full_output=True)
print(f'M/H={imh},C/O={ico}', chisqr(chemistry_grid[f'M/H={imh},C/O={ico}'] ))
# %% [markdown]
# We see with a M/H of 100 and C/O of 1, we get the best fit where the chi-squared is ~4. Let's plot all these and visually confirm.
# %% id="0f5bf193"
# Let's edit our function once again to allow for multiple model inputs
def show_spectra(output, x_range_min=3.0, x_range_max=5.5):
#Step 1) Regrid model to be on the same x axis as the data
wnos, transit_depth = zip(*[jdi.mean_regrid(output[x]['wavenumber'], output[x]['transit_depth'],
newx=sorted(1e4/observed_data['wavelength']))
for x in output])
transit_depth = [x-np.min(x) for x in transit_depth]
wnos = [x for x in wnos]
legends = [i for i in output]
# Step 2) Use picaso function to create a plot with the models
fig = jpi.spectrum(wnos, transit_depth, legend=legends,
plot_width=800,y_axis_label='Relative (Rp/Rs)^2',
x_range=(x_range_min,x_range_max))
# Step 3) Use picaso function to add the data so we can compare
jpi.plot_multierror(observed_data['wavelength'],
observed_data['tr_depth'] - np.min(observed_data['tr_depth']),
fig,
dy_low=observed_data['tr_depth_errneg'], dy_up=observed_data['tr_depth_errpos'],
dx_low=[0], dx_up=[0], point_kwargs={'color':'black'})
jpi.show(fig)
# %% id="25737413"
show_spectra(chemistry_grid)
# %% [markdown]
# ### Check your understanding
#
#
# Let's investigate how our chemistry choices affect our model spectrum. Explore the figure above by clicking on the legend to remove different lines.
#
# 1. Try looking only at lines that keep C/O steady while varying M/H from 1 to 10 to 100. What impact does the change in M/H ratio have on the spectrum?
#
# 2. You'll notice that as M/H increases, the CO$_2$ bump near 4.4 $\mu$m initially grows and then shrinks. Why might this be?
#
# 3. Now look at lines that keep M/H steady while varying C/O from 1 to 2.5. What impact does this change in C/O ratio have on the spectrum?
#
# 4. You'll notice that as C/O increases, the CO$_2$ bump near 4.4 $\mu$m disappears while a new CH$_4$ line appears near 3.35 $\mu$m. Why might this be?
#
# Some more targeted questions to help with the above:
#
# 1. Figure out what the top 4 most abundant oxygen- and carbon-bearing species are by looking at your chemistry grid.
# 2. How does their abundance change when you vary the value of M/H?
# 3. How does their abundance change when we increase C/O from 1 to 2.5?
# 4. How does mean the atmosphere's molecular weight change when you vary M/H?
# 5. Can you see that reflected in the spectrum?
#
# %% [markdown]
#