---
title: Project 1
subtitle: Transiting Exoplanet
engine: julia
toc-depth: 4
---
> Your team detected a planet candidate in the data from the Transiting Exoplanet Survey Satellite (TESS). The data is very noisy, but you are relatively confident the planet is there.
> Based on the TESS data, you can predict that the center of the next transit will be between the time t = 1.212 days and t = 1.362 days [Note: t is measured relative to an arbitrary reference time]. The TESS data also tell you that the radius of the planet is between 1% and 10% the radius of the star. Unfortunately, you don't know where the planet crosses the star, i.e. the impact parameter is unconstrained between 0 and 1. Assume that the orbital period is exactly known to be 3.0 days.
> You wrote a proposal and successfully convinced the time allocation committee of the James Webb Space Telescope (JWST) to obtain new very precise data for you. YAY!!!
> You receive the data in Project1_JWST_data.csv. The errors are $10^{-4}$ = 0.01% on each data point in the light curve.
> From these data, you want to determine new estimates with uncertainties for the transit time, the planet-to-star radius ratio, and the impact parameter.
We use the following Julia packages:
- [Transits.jl](https://juliaastro.org/Transits.jl): flexible and powerful occultation curves with limb darkening.
- [Turing.jl](https://github.com/TuringLang/Turing.jl): Bayesian inference with probabilistic programming
- [madsjulia/AffineInvariantMCMC.jl](https://github.com/madsjulia/AffineInvariantMCMC.jl): Affine Invariant Markov Chain Monte Carlo (MCMC) Ensemble sampler
- [chalk-lab/NestedSamplers.jl](https://github.com/chalk-lab/NestedSamplers.jl): Implementations of single and multi-ellipsoid nested sampling
- [bat/BAT.jl](https://github.com/bat/BAT.jl): A Bayesian Analysis Toolkit in Julia
Reference Python packages:
- [batman](https://lkreidberg.github.io/batman/docs/html/index.html): a Python package for fast calculation of exoplanet transit light curves
- [dynesty](https://github.com/joshspeagle/dynesty): Dynamic Nested Sampling package for computing Bayesian posteriors and evidences
::: {.callout-note}
In the updated version of the project, we use `Transits.jl`, which is more performant and provides features like automatic differentiation, instead of `batman`.
:::
## Introduction
```{julia}
dir = "docs/courses/epss298_DataAnalysis"
if isdir(dir)
cd(dir)
Pkg.activate(".")
Pkg.resolve()
Pkg.instantiate()
cd("projects")
end
```
```{julia}
using CSV, DataFrames
using CairoMakie
path = "Project1_JWST_data.csv"
data = CSV.read(path, DataFrame)
f = Figure()
markersize = 4
ax = Axis(f[1, 1], xlabel="Time", ylabel="Flux")
scatterlines!(ax, data.time, data.flux; markersize)
f
```
## Part A
### Step 1: Define the physical model
> Write a function for your physical "forward" model that takes in a vector with these three parameters: the mid-transit time $t_0$, the impact parameter $b$, and the planet-to-star radius ratio $r_p$.
```{julia}
using Orbits
using Transits
t0_i = 1.311544  # initial guess: mid-transit time [days]
b_i = 0.1        # initial guess: impact parameter
rp_i = 0.01      # initial guess: planet-to-star radius ratio
# Build a Keplerian orbit; semi-major axis `a` in stellar radii, period `per` in days
create_orbit(t0, b; a=15, per=3, kw...) = KeplerianOrbit(; period=per, t0, a, b, kw...)
create_orbit(nt; kw...) = create_orbit(nt[1], nt[2]; kw...)  # accept a (t0, b, rp, ...) tuple
function light_curve(time, t0, b, rp; u=[0.3, 0.3], kwargs...)
    orbit = create_orbit(t0, b; kwargs...)
    ld = QuadLimbDark(u)  # quadratic limb-darkening law with coefficients `u`
    return @. ld(orbit, time, rp)
end
flux = light_curve(data.time, t0_i, b_i, rp_i)
scatterlines!(f[1, 1], data.time, flux; markersize)
f
```
```{julia}
function light_curve_n(time, params::Vararg{Union{Tuple,NamedTuple},N}; ld=QuadLimbDark([0.3, 0.3]), kwargs...) where {N}
    rps = getindex.(params, 3)                 # radius ratio of each body
    orbits = create_orbit.(params; kwargs...)  # one orbit per body
    map(time) do t
        # Each single-body curve has an out-of-transit baseline of 1, so the
        # superposed flux is the sum of the individual curves minus (N - 1)
        s = reduce(+, ntuple(i -> ld(orbits[i], t, rps[i]), N))
        s - (N - 1) * one(s)
    end
end
```
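As a quick sanity check (a sketch only; the second body's values here are made up for illustration), `light_curve_n` takes one `(t0, b, rp)` tuple per body:

```julia
# Superpose the nominal planet with a hypothetical second body 0.01 days earlier
flux_two = light_curve_n(data.time, (t0_i, b_i, rp_i), (t0_i - 0.01, 0.5, 0.02))
```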
### Step 2: Define the Bayesian analysis
> 1. Write the given prior information as a log-prior function
> 2. Write a log-likelihood function
> 3. Write a log-probability function
Vanilla log-prior, log-likelihood, and log-probability functions could be written as:
```julia
log_prior(x, (lb, ub)) = lb <= x <= ub ? 0.0 : -Inf
# Transit time between 1.212 and 1.362 days
# Planet radius between 1% and 10% of star radius
# Impact parameter between 0 and 1
log_prior((t0, rp, b)) = log_prior(t0, (1.212, 1.362)) +
                         log_prior(rp, (0.01, 0.10)) +
                         log_prior(b, (0.0, 1.0))
function log_likelihood(params, time, obs, error;
    per=3.0,      # Period in days
    a=15.0,       # Semi-major axis in stellar radii
    u=[0.3, 0.3]  # Limb darkening coefficients
)
    t0, rp, b = params
    # KeplerianOrbit takes the impact parameter directly, so no inclination conversion is needed
    model_flux = light_curve(time, t0, b, rp; per, a, u)
    # Log-likelihood assuming independent Gaussian errors (up to an additive constant)
    residuals = obs .- model_flux
    chi_squared = sum((residuals ./ error) .^ 2)
    return -0.5 * chi_squared
end
log_probability(params, time, obs, error) = log_prior(params) + log_likelihood(params, time, obs, error)
```
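For example, the log-probability can be evaluated at the initial guesses defined above (parameter order `[t0, rp, b]`, with the quoted $10^{-4}$ errors):

```julia
log_probability([t0_i, rp_i, b_i], data.time, data.flux, 1e-4)
```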
Using [`Turing.jl`](https://github.com/TuringLang/Turing.jl), we can write the same model more compactly:
```{julia}
using Turing
@model function transit_model(time, flux; error=1e-4)
t0 ~ Uniform(1.212, 1.362)
b ~ Uniform(0.0, 0.8)
rp ~ Uniform(0.01, 0.10)
model_flux = light_curve(time, t0, b, rp)
return flux ~ MvNormal(model_flux, error)
end
model = transit_model(data.time, data.flux)
```
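As a sanity check mirroring `log_probability` above, the log-joint density of the Turing model can be evaluated directly at the initial guesses:

```julia
logjoint(model, (; t0=t0_i, b=b_i, rp=rp_i))
```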
### Step 3: Demonstrate solution on a grid
> Determine new estimates with uncertainties of the planet-to-star radius ratio and the impact parameter. Fix the transit mid-time to t0 = 1.311544 days. Make a 50x50 grid that explores planet-to-star radius ratios between 0.05275 and 0.053505 as well as impact parameters between 0 and 0.3.
#### 1. Plot the joint posterior distribution and mark the best fit with a marker
```{julia}
using LinearAlgebra # For eigen decomposition and other matrix operations
using Statistics # For statistical functions
using StatsBase
fixed_t0 = 1.311544
n_grid = 50
rp_grid = range(0.05275, 0.053505, length=n_grid)
b_grid = range(0.0, 0.3, length=n_grid)  # 50 points, matching the requested 50x50 grid
log_posterior_grid = [
logjoint(model, (; rp, b, t0=fixed_t0))
for rp in rp_grid, b in b_grid
]
posterior_grid = exp.(log_posterior_grid .- maximum(log_posterior_grid)) # (subtract max to avoid numerical issues)
posterior_grid = posterior_grid ./ sum(posterior_grid) # Normalize the posterior
# Find the maximum posterior point
max_idx = argmax(posterior_grid)
best_rp = rp_grid[max_idx[1]]
best_b = b_grid[max_idx[2]]
@info "Best fit values" best_rp best_b
b_label = "Impact parameter (b)"
rp_label = "Planet-to-star radius ratio (rp)"
fig = Figure(size=(1000, 800))
ax1 = Axis(fig[1, 1], xlabel=rp_label, ylabel=b_label, title="Joint Posterior Distribution")
hm = heatmap!(ax1, rp_grid, b_grid, posterior_grid)
scatter!(ax1, [best_rp], [best_b], color=:red, markersize=15)
Colorbar(fig[1, 1, Right()], hm, label="Posterior Probability")
fig
```
#### 2. Marginalized posterior distributions
> Plot the marginalized posterior distributions for the planet-to-star radius ratio and the impact parameters.
```{julia}
rp_posterior = vec(sum(posterior_grid, dims=2))
rp_posterior = weights(rp_posterior ./ sum(rp_posterior)) # Normalize
b_posterior = vec(sum(posterior_grid, dims=1))
b_posterior = weights(b_posterior ./ sum(b_posterior)) # Normalize
ax2 = Axis(fig[2, 1], xlabel=rp_label, ylabel="Probability")
lines!(ax2, rp_grid, rp_posterior)
ax3 = Axis(fig[1, 2], xlabel=b_label, ylabel="Probability")
lines!(ax3, b_grid, b_posterior)
fig
```
#### 3. Best estimates and uncertainties
> Calculate the mean and +/-1 sigma uncertainties for both parameters. Mark the mean value with a solid vertical line in the figure created in Step 3.2. Make the +/-1 sigma uncertainties with dashed vertical lines. Report the best estimates (mean and uncertainty) for both parameters quantitatively.
```{julia}
using Measurements
# For rp
rp_mean = mean(rp_grid, rp_posterior)
rp_std = std(rp_grid, rp_posterior)
rp = measurement(rp_mean, rp_std)
# For b
b_mean = mean(b_grid, b_posterior)
b_std = std(b_grid, b_posterior)
b = measurement(b_mean, b_std)
# Add mean and +/- 1 sigma to the marginalized plots
vlines!(ax2, [rp_mean], color=:black, linewidth=2)
vlines!(ax2, [rp_mean - rp_std, rp_mean + rp_std], color=:black, linestyle=:dash)
vlines!(ax3, [b_mean], color=:black, linewidth=2)
vlines!(ax3, [b_mean - b_std, b_mean + b_std], color=:black, linestyle=:dash)
@info "Planet-to-star radius ratio" rp
@info "Impact parameter" b
fig
```
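A side benefit of `Measurements.jl` is automatic uncertainty propagation; for instance, the approximate transit depth $\delta \approx r_p^2$ (ignoring limb darkening) comes with a propagated error for free:

```julia
depth = rp^2  # transit depth ≈ (Rp/Rs)²; uncertainty propagated automatically
```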
#### 4. Covariance matrix and confidence regions
> Calculate the covariance matrix that best represents the joint posterior distribution. What is the correlation coefficient between the planet-to-star radius ratios and the impact parameter? Plot the 68% and 95% confidence regions onto the figure created in Step 3.1.
```{julia}
"""Calculate covariance matrix elements"""
function covariance_matrix(p, x, y)
x_2d = repeat(x, 1, length(y))
y_2d = repeat(y', length(x), 1)
wp = weights(p)
x_mean = mean(x_2d, wp)
y_mean = mean(y_2d, wp)
# Calculate covariance matrix elements
cov_x_x = mean((x_2d .- x_mean) .^ 2, wp)
cov_y_y = mean((y_2d .- y_mean) .^ 2, wp)
cov_x_y = mean((x_2d .- x_mean) .* (y_2d .- y_mean), wp)
return [cov_x_x cov_x_y; cov_x_y cov_y_y]
end
cov_matrix = covariance_matrix(posterior_grid, rp_grid, b_grid)
@info "Covariance matrix:" cov_matrix
# Calculate correlation coefficient
corr_coef = cov_matrix[1, 2] / (sqrt(cov_matrix[1, 1]) * sqrt(cov_matrix[2, 2]))
@info "Correlation coefficient:" corr_coef
```
```{julia}
using Distributions
function getellipsepoints(cx, cy, rx, ry, θ; length=100)
t = range(0, 2π; length)
ellipse_x_r = @. rx * cos(t)
ellipse_y_r = @. ry * sin(t)
R = [cos(θ) sin(θ); -sin(θ) cos(θ)]
r_ellipse = [ellipse_x_r ellipse_y_r] * R
x = @. cx + r_ellipse[:, 1]
y = @. cy + r_ellipse[:, 2]
(x, y)
end
function getellipsepoints(μ, Σ, confidence=0.95)
    quant = sqrt(quantile(Chisq(2), confidence))
    egvs = eigvals(Σ)
    if egvs[1] > egvs[2]
        idxmax = 1
        largestegv = egvs[1]
        smallestegv = egvs[2]
    else
        idxmax = 2
        largestegv = egvs[2]
        smallestegv = egvs[1]
    end
    rx = quant * sqrt(largestegv)
    ry = quant * sqrt(smallestegv)
    # Orientation of the major axis from the corresponding eigenvector
    eigvecmax = eigvecs(Σ)[:, idxmax]
    θ = atan(eigvecmax[2], eigvecmax[1])
    θ < 0 && (θ += 2π)
    getellipsepoints(μ..., rx, ry, θ)
end
```
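The scaling factor `quant` follows from the fact that, for a 2-D Gaussian, the squared Mahalanobis distance is $\chi^2_2$-distributed, so the region containing probability $p$ is an ellipse with semi-axes

$$ r_i = \sqrt{Q_{\chi^2_2}(p)\,\lambda_i}, $$

where $Q_{\chi^2_2}$ is the quantile function of the $\chi^2_2$ distribution and $\lambda_i$ are the eigenvalues of $\Sigma$.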
```{julia}
μ = [rp_mean, b_mean]
Σ = cov_matrix
lines!(ax1, getellipsepoints(μ, Σ, 0.95)..., label="95% confidence region")
lines!(ax1, getellipsepoints(μ, Σ, 0.68)..., label="68% confidence region")
axislegend(ax1)
fig
```
## Part B
### Step 4: Solve the full problem using MCMC and Nested Sampling
> Determine new estimates with uncertainties of the transit time, the planet-to-star radius ratio, and the impact parameter.
> 1. Using Emcee. Make all possible plots and quantitative assessments to convince the reviewer that your results are converged
> 2. Using Dynesty. Make all possible plots and quantitative assessments to convince the reviewer that your results are converged
#### Affine Invariant MCMC (Emcee-like)
```{julia}
chain = sample(model, Emcee(20), MCMCThreads(), 1000, 4; progress=false)
```
```{julia}
using StatsPlots
StatsPlots.plot(chain)
```
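Beyond the trace plots, rank-normalized $\hat{R}$ values close to 1 and large effective sample sizes are standard quantitative convergence checks; a sketch, assuming the `ess_rhat` diagnostic provided by `MCMCChains`:

```julia
using MCMCChains: ess_rhat
ess_rhat(chain)  # R̂ ≈ 1 and large ESS indicate convergence
```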
#### NestedSamplers (Dynesty-like)
> [chalk-lab/NestedSamplers.jl: Implementations of single and multi-ellipsoid nested sampling](https://github.com/chalk-lab/NestedSamplers.jl)
```{julia}
using NestedSamplers
loglike = let m = model
(X) -> loglikelihood(m, (t0=X[1], rp=X[2], b=X[3]))
end
priors = [Uniform(1.212, 1.362), Uniform(0.01, 0.10), Uniform(0.0, 0.8)]
nmodel = NestedModel(loglike, priors)
sampler = Nested(3, 2000)
chain, state = sample(nmodel, sampler, progress=false)
```
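Nested sampling additionally returns the log-evidence with an uncertainty estimate, which serves as one quantitative check on the run (field names as returned in the sampler `state`):

```julia
@info "Nested sampling evidence" state.logz state.logzerr
```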
```{julia}
StatsPlots.plot(chain)
```
#### Hamiltonian Monte Carlo (HMC)
Since the light-curve model is now automatically differentiable, we can use gradient-based algorithms like the No-U-Turn Sampler (NUTS), which exploits Hamiltonian dynamics and is very efficient.
```{julia}
chain = sample(model, NUTS(), 2000, progress=false)
StatsPlots.plot(chain)
```
```{julia}
using PairPlots: pairplot
pairplot(chain)
```
#### Performance comparison
```{julia}
using Chairmarks
display(@b sample(model, NUTS(), 2000; progress=false))
display(@b sample(model, Emcee(16), 2000; progress=false))
display(@b sample(nmodel, sampler, progress=false))
```
We can see that NUTS is the fastest, followed by Emcee and then NestedSamplers. NUTS also appears to converge faster than the other two.
### Step 5: Investigate whether your model is a good fit to the data
> Investigate the residuals. What do you think? Quantitatively and graphically assess whether your model is a good fit to the data.
```{julia}
using Statistics
function selected_params(chain)
parameters = tuple(chain.name_map.parameters...)
values = map(p -> median(chain[p]), parameters)
return NamedTuple{parameters}(values)
end
function split_namedtuple(nt, every)
    # Split a NamedTuple into consecutive groups of `every` entries
    ks = keys(nt)
    map(1:length(nt)÷every) do i  # integer division: the number of groups
        indices = (i-1)*every+1:i*every
        nt[ks[indices]]
    end
end
model_flux = light_curve(data.time, selected_params(chain)...)
residuals = data.flux .- model_flux
f1 = Figure()
markersize = 4
ax1 = Axis(f1[1, 1], ylabel="Flux")
scatterlines!(ax1, data.time, data.flux; markersize, label="Data")
scatterlines!(ax1, data.time, model_flux; markersize, label="One-body Model")
ax2 = Axis(f1[2, 1], xlabel="Time", ylabel="Residuals")
scatter!(ax2, data.time, residuals; label="One-body Model")
Axis(f1[3, 1], xlabel="Residual", ylabel="Count", title="Histogram of Residuals")
hist!(f1[3, 1], residuals, bins=30)
display(f1)
@info "Mean residual: " mean(residuals)
@info "Std of residuals: " std(residuals)
```
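A simple quantitative goodness-of-fit measure is the reduced chi-squared, which should be close to 1 when the model and the quoted $10^{-4}$ errors describe the data well (a sketch reusing `residuals` from above):

```julia
err = 1e-4  # quoted per-point uncertainty
chi2_red = sum((residuals ./ err) .^ 2) / (length(residuals) - 3)  # 3 free parameters
@info "Reduced chi-squared" chi2_red
```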
### Step 6: Refine your physical model
> 1. What could explain your findings in Part 4? Tip: When a distant observer sees a transit of the Earth in front of the sun, would it only be the Earth that is transiting at that time?
There is an increase in the residuals around $t \approx 1.33$ days. This could be due to the presence of another object transiting the star. For example, if the system contains both a planet and a moon (or another planet), the observed light curve would show additional dips or asymmetries not explained by a single-object model.
#### New physical "forward" model
> 2. Code a new physical "forward" model and log-likelihood function that better explains the data.
```{julia}
@model function transit_model_two_bodies(time, flux; error=1e-4)
# Priors for planet
t0_1 ~ Uniform(1.212, 1.362)
b_1 ~ Uniform(0.0, 0.8)
rp_1 ~ Uniform(0.01, 0.10)
# Priors for moon (or second planet)
t0_2 ~ Uniform(1.212, 1.362)
b_2 ~ Uniform(0.0, 0.8)
rp_2 ~ Uniform(0.01, 0.10)
model_flux = light_curve_n(time, (t0_1, b_1, rp_1), (t0_2, b_2, rp_2))
return flux ~ MvNormal(model_flux, error)
end
```
#### Find the constraints of both objects
> 3. Find the constraints of both objects. Treat the moon like another body transiting shortly before or after. Use the same prior for the moon. That makes six parameters.
For performance reasons, we use `NUTS` instead of `NestedSamplers`.
```{julia}
model2 = transit_model_two_bodies(data.time, data.flux)
chain2 = sample(model2, NUTS(), 3000, progress=false)
```
```{julia}
StatsPlots.plot(chain2) |> display
pairplot(chain2)
```
#### Convergence assessment
> 4. Make plots to convince the reviewer that your results are converged
The figure below shows the trace and autocorrelation plots for all parameters. The chains appear well mixed and stationary, indicating convergence. The Monte Carlo standard error is small, further supporting convergence.
Reference: [Autocorrelation analysis & convergence — emcee](https://emcee.readthedocs.io/en/stable/tutorials/autocorr/)
```{julia}
@info "Monte Carlo standard error" mcse(chain2)
autocorplot(chain2)
```
#### Model comparison
> 5. Determine quantitatively whether the model justifies applying this more complex model? How confident are you that you detected a moon?
Here we compare the models quantitatively via a likelihood ratio at the posterior medians, and check how well each model reproduces the observed data.
```{julia}
function log_likelihood_ratio(model1, chain1, model2, chain2)
    # Difference of log-joint densities at the posterior medians;
    # its exponential is the corresponding likelihood ratio
    logjoint(model2, selected_params(chain2)) - logjoint(model1, selected_params(chain1))
end
@info "Log-probability difference in favor of model 2" log_likelihood_ratio(model, chain, model2, chain2)
model_flux2 = light_curve_n(data.time, split_namedtuple(selected_params(chain2), 3)...)
residuals2 = data.flux .- model_flux2
scatterlines!(f1[1, 1], data.time, model_flux2; markersize, label="Two-body Model")
scatter!(f1[2, 1], data.time, residuals2, label="Two-body Model")
hist!(f1[3, 1], residuals2, bins=30)
axislegend(ax1)
axislegend(ax2)
display(f1)
@info "Mean residual: " mean(residuals2)
@info "Std of residuals: " std(residuals2)
```
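A more standard quantitative criterion is an information criterion such as the BIC, which penalizes the extra parameters of the two-body model; a sketch using the posterior medians as stand-ins for the maximum-likelihood estimates:

```julia
# BIC = k·ln(n) − 2·ln(L); lower is better
n = length(data.flux)
bic1 = 3 * log(n) - 2 * logjoint(model, selected_params(chain))
bic2 = 6 * log(n) - 2 * logjoint(model2, selected_params(chain2))
@info "BIC (lower is better)" bic1 bic2
```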
The two-body model yields smaller residuals than the single-body model, indicating a better fit.
#### Summary
> 6. Summarize your results for the parameters for each object?
The summary statistics for the two-body model are shown below:
```{julia}
chain2
```
#### Multi-modal solution
> 7. Why do you get a multi-modal solution? What could you do to avoid this in this simple example?
The multi-modality is inherent to the model: the two bodies are interchangeable, so swapping the labels of body 1 and body 2 gives an equally good solution (label switching). Running the `NUTS` sampler with a single chain can hide this, since one chain tends to stay in one mode, whereas multiple chains may settle into different modes.
> This is because it’s possible for either model parameter to be assigned to either of the corresponding true means, and this assignment need not be consistent between chains. That is, the posterior is fundamentally multimodal, and different chains can end up in different modes, complicating inference. One solution here is to enforce an ordering on our parameters, e.g. requiring $t_{0,1} < t_{0,2}$. `Bijectors.jl` provides an easy transformation (`ordered()`) for this purpose.
Another solution is to post-process the chain to exchange parameters to enforce ordering.
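A minimal sketch of the ordered-prior approach (assuming `Bijectors.ordered`, and swapping the original uniform priors on the transit times for a loose Gaussian centered in the allowed window, since `ordered` composes most easily with unconstrained distributions):

```julia
using Bijectors: ordered
using LinearAlgebra: I

@model function transit_model_two_bodies_ordered(time, flux; error=1e-4)
    # Ordered prior enforces t0[1] ≤ t0[2], removing the label-switching mode
    t0 ~ ordered(MvNormal(fill(1.287, 2), 0.05^2 * I))
    b_1 ~ Uniform(0.0, 0.8)
    rp_1 ~ Uniform(0.01, 0.10)
    b_2 ~ Uniform(0.0, 0.8)
    rp_2 ~ Uniform(0.01, 0.10)
    model_flux = light_curve_n(time, (t0[1], b_1, rp_1), (t0[2], b_2, rp_2))
    return flux ~ MvNormal(model_flux, error)
end
```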
### Step 7: Application
> Describe in a few sentences one example for a problem or data set that you could analogously solve in your research domain?
In plasma physics, particle velocity distributions are often non-Maxwellian and may be composed of several populations (core, halo, beam). By modeling the observed distribution as the sum of several Maxwellian or kappa distributions, and using Bayesian inference to fit the parameters (density, temperature, drift speed for each component), you can determine how many components are justified and estimate their properties, just as you determine the number of transiting bodies in the exoplanet problem.
## Old code (using `batman`)
### Julia interface to Python and `batman`
```julia
#| output: false
using CondaPkg
using PythonCall
import PythonCall: Py
pkg = "batman-package"
deps = CondaPkg.read_deps()["deps"]
haskey(deps, pkg) || CondaPkg.add(pkg)
# CondaPkg.add("corner")
const batman = pyimport("batman")
const corner = pyimport("corner")
const np = pyimport("numpy")
```
```julia
using DrWatson: @dict
function set_params!(params; kwargs...)
for (key, value) in kwargs
setproperty!(params, key, value)
end
return params
end
"""Like `batman.TransitParams` in Python, but with some keyword arguments for convenience."""
function TransitParams(; ecc=0.0, w=0.0, limb_dark="quadratic", kwargs...)
params = batman.TransitParams()
set_params!(params; ecc, w, limb_dark)
set_params!(params; kwargs...)
return params
end
TransitModel(params, time) = batman.TransitModel(params, Py(time).to_numpy())
light_curve(model::Py, params::Py) = PyArray(model.light_curve(params))
function light_curve(params, time)
m = TransitModel(params, time)
return light_curve(m, params)
end
light_curve(time; kwargs...) = light_curve(TransitParams(; kwargs...), time)
```
```julia
# Initial values for parameters not defined above (matching the defaults used earlier)
a_i = 15.0
u_i = [0.3, 0.3]
per_i = 3.0
inc(a, b) = acosd(b / a)
inc_i = inc(a_i, b_i)
flux = let rp = rp_i, t0 = t0_i, b = b_i, a = a_i, u = u_i, per = per_i
inc = acosd(b / a)
light_curve(data.time; t0, per, rp, a, inc, u)
end
```
### Emcee sampler
```julia
inc(p::Py, b) = inc(pyconvert(Any, p.a), b)
@model function transit_model(time, flux, params, trans_model=TransitModel(params, data.time); error=1e-4)
# Priors
t0 ~ Uniform(1.212, 1.362)
rp ~ Uniform(0.01, 0.10)
b ~ Uniform(0.0, 0.99)
params = set_params!(params; t0, rp, inc=inc(params, b))
model_flux = light_curve(trans_model, params)
return flux ~ MvNormal(model_flux, error)
end
params = TransitParams(; rp=rp_i, t0=t0_i, per=3.0, a=a_i, u=u_i, inc=inc_i)
trans_model = TransitModel(params, data.time)
model = transit_model(data.time, data.flux, params)
chain = sample(model, Emcee(100), 10000; progress=false)
```
## Reference
- [DACE tutorial on photometry](https://dace.unige.ch/tutorials/?tutorialId=32)
- Dynamic Hamiltonian Monte Carlo (HMC): [tpapp/DynamicHMC.jl — implementation of robust dynamic Hamiltonian Monte Carlo methods (NUTS) in Julia](https://github.com/tpapp/DynamicHMC.jl)