Getting started#

To demonstrate a typical use case for epsf, we will reproduce results from Calissendorff et al. (2023), which presented the first Y+Y dwarf binary detection. To keep the tutorial simple, we will fit a single JWST image.

Loading the Data#

The JWST stage 2 image pipeline produces cal files, stacking all integrations in an observation. The science image is in the SCI extension and its uncertainty in the ERR extension. These are wide-field images, so we will need to extract a cutout from the image and error arrays. We know the position of the Y dwarf in this image, so we will hardcode it and take a 68x68 pixels region around it.

import numpy as np
from pathlib import Path
from astropy.io import fits
from astroquery.mast import Observations

tbl = Observations.query_criteria(
    proposal_id="02473",
    project="JWST",
    filters=["F480M", "F150W"],
)

x, y = 1281, 825
half_size = 34


img_path = Path("./data/jw02473053001_03101_00002_nrcblong_cal.fits")
img_path.parent.mkdir(exist_ok=True)
status, msg, _ = Observations.download_file(
    "mast:JWST/product/jw02473053001_03101_00002_nrcblong_cal.fits", local_path=img_path
)
hdul = fits.open(img_path)
img_full = hdul["SCI"].data
err_full = hdul["ERR"].data

crop_ind = np.s_[y - half_size : y + half_size, x - half_size : x + half_size]
img = img_full[crop_ind]
err = err_full[crop_ind]
img_pre_crop = img.copy()
err_pre_crop = err.copy()
INFO: Found cached file data/jw02473053001_03101_00002_nrcblong_cal.fits with expected size 117576000. [astroquery.query]
import matplotlib.pyplot as plt
from epsf.plot import plot_image, plot_with_diff, plot_flat_samples


fig, axs = plt.subplots(1, 2, figsize=(16, 6))
plot_image(img, ax=axs[0], color_label="e$^{-}$/s")
axs[0].set_title("Image")
plot_image(err, ax=axs[1], color_label="e$^{-}$/s")
axs[1].set_title("Uncertainty")
plt.show()
../_images/a8ab61eef2db3896fc02faa888e9ab99ce84946a9681af2597774472e614291e.png

The 68x68 pixels image size we chose above was an arbitrary choice for display purposes. The effective PSF (ePSF) models from JWST1PASS are defined on an 11x11 pixels grid, so we will need to crop our image a second time. The x-y positions we used at the start were rough estimates useful only to crop the wide-field image. Let us find the centroid of the 68x68 image and then crop it to 11x11. For JWST PSFs, the centroid_quadratic function from photutils seems to do a reasonable job.

from photutils.centroids import centroid_quadratic

xy_centroid = centroid_quadratic(img, mask=np.isnan(img))
xcen_int, ycen_int = np.round(xy_centroid).astype(int)

plot_image(img)
plt.plot(*xy_centroid, "k*", label="Centroid")
plt.plot(xcen_int, ycen_int, "r*", label="Integer Centroid")
plt.title("Data and centroid")
plt.legend()
plt.show()
../_images/e13f6e412ae6e378b976144e0bc8be4bb693ea2979b83b63645664f507d6cd05.png
final_size = 11
start, end = final_size // 2, final_size // 2 + final_size % 2
x1, x2, y1, y2 = xcen_int - start, xcen_int + end, ycen_int - start, ycen_int + end
img = img_pre_crop[y1:y2, x1:x2]
err = err_pre_crop[y1:y2, x1:x2]

plot_image(img)
plt.plot(final_size // 2, final_size // 2, "r*", label="Centroid")
plt.title(f"Image cropped to {final_size}x{final_size}")
plt.show()
../_images/5a5da572473bb1be638ae93bded22ef1f8df9f00ffe810a8599547eb26b05f5d.png

Our data is now ready to be modelled with an ePSF!

Effective PSF#

Note

JWST1PASS ePSFs must first be downloaded. See the installation instructions for more detail.

JWST1PASS ePSFs are stored as fits files with a grid of PSFs and associated detector positions stored in the header. epsf has an EPSFGrid object that stores this grid and can interpolate it to any detector position.

from epsf.grid import EPSFGrid

# There is no psf for module B so we use module A
grid = EPSFGrid.from_params("NIRCAM", "F480M", detector="NRCAL")

Before we interpolate the grid, let us look at the PSFs for all detector positions.

grid.plot()
plt.show()
../_images/a8d71f1cb9e6d80da24e9a82c0e41d9bdeade57c546f994ce7ba732ba8cfcc87.png

To interpolate, we simply need to call the grid object with x and y detector positions, which we already have from the previous section. This will return an EPSF object, which has an array attribute storing the interpolated PSF and a spl attribute with a scipy.interpolate.RectBivariateSpline allowing us to shift the PSF to any detector position in our cutout. By default, the PSF will have a true_size of 11, meaning it can generate images of 11x11 pixels and an oversampling factor of 4, since this is what was used to generate the JWST1PASS PSFs. These two values can be changed by passing them as arguments when creating the EPSF.

psf = grid(x, y)
print(psf)
EPSF(true_size=11, oversampling=4)

Calling the PSF object with x and y positions will generate a PSF shifted by a given amount of pixels in each direction.

test_img = 1200 * psf(3, 3)

fig, axs = plot_with_diff(img, test_img, scale="linear")
axs[0].set_title("Data")
axs[1].set_title("Test ePSF model")
axs[2].set_title("Difference")
plt.show()
../_images/57ab67015f507fe81b9d24c876a7c9f567daa20b896842e3625e941cf2827c8b.png

Bayesian Inference#

To do Bayesian inference, we could directly use the EPSF object above and write log-likelihood and prior functions to use with an MCMC or nested sampling library. However, epsf.bayes provides useful models built with simpple. That way, the likelihood and priors are already defined for us and we can use them with the sampling library of our choice.

Single model#

Let us first define a model for a single source. We include a constant background term as well as a white noise term added in quadrature with our uncertainties.

We will use the UltraNest library to generate posterior samples and estimate the Bayesian evidence.

import simpple.distributions as sdist
from epsf.bayes import SingleModel

parameters = {
    "x0": sdist.Uniform(-1.5, 1.5),
    "y0": sdist.Uniform(-1.5, 1.5),
    "flux": sdist.LogUniform(1e-5, 5000),
    "bkg": sdist.Uniform(-100, 100),
    "sigma": sdist.LogUniform(1e-5, 100),
}
model = SingleModel(parameters, psf)
from ultranest import ReactiveNestedSampler

sampler = ReactiveNestedSampler(
    model.keys(), lambda p: model.log_likelihood(p, img, err), model.prior_transform
)
sampler.run(show_status=False);
[ultranest] Sampling 400 live points from prior ...
[ultranest] Explored until L=-3e+02  
[ultranest] Likelihood function evaluations: 35558
[ultranest]   logZ = -332.9 +- 0.1992
[ultranest] Effective samples strategy satisfied (ESS = 2373.5, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.45+-0.05 nat, need <0.50 nat)
[ultranest] Evidency uncertainty strategy is satisfied (dlogz=0.20, need <0.5)
[ultranest]   logZ error budget: single: 0.25 bs:0.20 tail:0.01 total:0.20 required:<0.50
[ultranest] done iterating.
sampler.plot()
plt.show()
../_images/aa772f251dac7ffb9f8ad0953f5bfb85a1c5fa42999dc6cb93093165ec0c8244.png ../_images/d6dcc262ccbcdc223afb5fbd753f64c9a0b77be91bf7c268cb316f5841af038c.png ../_images/0d6343544137b6dad330de4caa81c2e29e90766f28bddeb4d1003a22b598e58f.png

We can then use the posterior median to compare the model with the data.

med_p = dict(
    zip(
        model.keys(),
        sampler.results["posterior"]["median"],
    )
)
med_mod = model.forward(med_p)

fig, axs = plot_with_diff(img, med_mod, scale="linear")
axs[0].set_title("Data")
axs[1].set_title("Median model")
axs[2].set_title("Difference")
plt.show()
../_images/ef7ea7eecabf9c90489880e61d29abcfad66ce17c7bea6e1cfc8c207cd915bd4.png

It is also useful to visualize posterior samples of the forward model compared to the data. However, this is a bit cumbersome to display for images. We would need a grid of samples and a grid of residuals. As an alternative, we will flatten the image and compare the pixel values with predictions from the model on a 1D plot.

plot_flat_samples(img, err, model.get_posterior_pred(sampler.results["samples"].T, 100))
plt.show()
../_images/93eb71cd42022ab6a1673899e54a475faa2afe3c70d50348471bccf5bc6f1caa.png

Binary model#

We will now repeat the same steps as above, but with a binary model.

from epsf.bayes import BinaryModel

parameters = {
    "x0": sdist.Uniform(-1.5, 1.5),
    "y0": sdist.Uniform(-1.5, 1.5),
    "flux": sdist.LogUniform(1e-5, 5000),
    "bkg": sdist.Uniform(-100, 100),
    "sep": sdist.Uniform(0.5, 5),
    "pa": sdist.Uniform(0, 360),
    "cr": sdist.LogUniform(1e-4, 1.0),
    "sigma": sdist.LogUniform(1e-4, 100.0),
}
model_binary = BinaryModel(parameters, psf)

Since we have more parameters, UltraNest will struggle to generate valid samples in a high-dimensional space. Instead of region sampling, we add a stepsampler to our nested sampler. This enables a more efficient exploration of parameter space.

from ultranest.stepsampler import SliceSampler, generate_mixture_random_direction

wrapped_params = [pname == "pa" for pname in model_binary.keys()]
nsteps = len(model.keys()) * 2
sampler_binary = ReactiveNestedSampler(
    model_binary.keys(),
    lambda p: model_binary.log_likelihood(p, img, err),
    model_binary.prior_transform,
    wrapped_params=wrapped_params,
)
sampler_binary.stepsampler = SliceSampler(
    nsteps=nsteps,
    generate_direction=generate_mixture_random_direction,
)

sampler_binary.run(show_status=True);
[ultranest] Sampling 400 live points from prior ...
[ultranest] Explored until L=-1e+02  .37 [-130.0117..-130.0115]*| it/evals=19379/751016 eff=2.5817% N=400  400   0 00 0 
[ultranest] Likelihood function evaluations: 751359
[ultranest]   logZ = -173.7 +- 0.2281
[ultranest] Effective samples strategy satisfied (ESS = 2948.1, need >400)
[ultranest] Posterior uncertainty strategy is satisfied (KL: 0.45+-0.05 nat, need <0.50 nat)
[ultranest] Evidency uncertainty strategy wants 398 minimum live points (dlogz from 0.19 to 0.57, need <0.5)
[ultranest]   logZ error budget: single: 0.32 bs:0.23 tail:0.01 total:0.23 required:<0.50
[ultranest] done iterating.
sampler_binary.plot()
plt.show()
../_images/82a401c48cd5d3952359995a8b019995e7ebe8989a53683b0eedeb3399a0fcb5.png ../_images/bb82e950e555d54cf3ebcae18f8a85cee9e0e78bd6f3f6f038dbb91d09feaffe.png ../_images/448ae8f1d47c0e0f6a48540bca23eff8ad7671cfa166aed80f5d9c50b394f1d1.png

One thing to note here is that the contrast is different from Calissendorff et al. (2023). This is most likely due to the fact they used a custom ePSF model instead of one from JWST1PASS. We will look into this in a future release.

med_p = dict(
    zip(
        model_binary.keys(),
        sampler_binary.results["posterior"]["median"],
    )
)
med_mod = model_binary.forward(med_p)

fig, axs = plot_with_diff(img, med_mod, scale="linear")
axs[0].set_title("Data")
axs[1].set_title("Median model")
axs[2].set_title("Difference")
plt.show()
../_images/dd706f3900e6d265e5e1dd096f138dba2db4208ed7165202e9d8702d8dc73354.png

In addition to comparing the median binary model, we can subtract only the primary to see the companion pop up.

med_mod_primary = model.forward(med_p)
fig, axs = plot_with_diff(img, med_mod_primary, scale="linear")
axs[0].set_title("Data")
axs[1].set_title("Median model")
axs[2].set_title("Difference")
plt.show()
../_images/28a26b66b785f1ccbfa7fd0c42b0bf8768238d3a480ad91f907d97b4ace8f82b.png
fig, axs = plot_flat_samples(
    img, err, model_binary.get_posterior_pred(sampler_binary.results["samples"].T, 100)
)
axs[0].set_title("Posterior samples vs image")
plt.show()
../_images/369d4425e3b22131a10824f343a07a0a7e23053e5d4396005fd53aee8aa6f1ca.png
median_sigma = sampler_binary.results["posterior"]["median"][
    model.keys().index("sigma")
]
fig, axs = plot_flat_samples(
    img,
    np.sqrt(err**2 + median_sigma**2),
    model_binary.get_posterior_pred(sampler_binary.results["samples"].T, 100),
)
axs[0].set_title("Posterior samples vs image (inflated errors)")
plt.show()
../_images/13c08573ce64bd3cb0535c66255fb83e577ce2ed367487697c285faec1d3ee12.png

Model comparison#

Let us now compare the single and binary models using the Bayes factor.

print(f"lnZ single: {sampler.results['logz']:.2f} +/- {sampler.results['logzerr']:.2f}")
print(
    f"lnZ binary: {sampler_binary.results['logz']:.2f} +/- {sampler_binary.results['logzerr']:.2f}"
)
lnK = sampler_binary.results["logz"] - sampler.results["logz"]
lnK_err = np.sqrt(
    sampler.results["logzerr"] ** 2 + sampler_binary.results["logzerr"] ** 2
)
print(f"lnK binary - single: {lnK:.2f} +/- {lnK_err:.2f}")
lnZ single: -332.98 +/- 0.40
lnZ binary: -173.69 +/- 0.35
lnK binary - single: 159.29 +/- 0.53

As expected, the binary model is overwhelmingly favored by the data.