Multiple instruments¶
It is quite typical to have RV observations from more than one instrument. These data should be analysed together for the detection of planets.
- Considering RVs from multiple instruments means adding new offset parameters to the model. By default, all these parameters share the same prior, but we'll see below how to change this.
- The same applies to the additional white noise (jitter). One jitter parameter is added per instrument, all sharing the same prior.
Newer versions of kima now allow for a stellar jitter parameter which is shared between all the instruments. By default, this parameter is fixed to zero to recover the old behaviour.
First of all, let's import the package
import kima
Detecting the planet around HD106252¶
First announced by Fischer et al. (2002), the giant planet orbiting HD106252 has been detected with a number of different instruments (Perrier et al. 2003, Butler et al. 2006, Wittenmyer et al.2009). In this example, we look at data from ELODIE, HET, HJS, and Lick.
Each of the datasets can be imported like this
from kima.examples.multi_instruments import HD106252_ELODIE
and a simple visualisation of the RVs is provided by the plot
method
HD106252_ELODIE.plot();
The combined data is also readily available
from kima.examples.multi_instruments import HD106252_combined
HD106252_combined.plot();
Note the clear offset between ELODIE data (in blue above) and the other datasets. The average RV has been subtracted from the HET, HJS, and Lick data, but not from ELODIE. This slightly convoluted situation is ideal to demonstrate how kima deals with multiple instruments, especially with the default priors.
We can now use one of the available models to analyse this dataset. We can either create the model
from kima import RVmodel
model = RVmodel(fix=False, npmax=1, data=HD106252_combined)
or import the multi_instruments
function from the examples, which does the same
thing (but can also run the model directly)
from kima.examples import multi_instruments
model = multi_instruments(run=False)
Notice how we set the number of planets to be free, implicitly assigning to it
a uniform prior between 0 and 1 (by setting npmax=1
).
This is all that is necessary to setup the data and model, and we are now ready to run the analysis. Notice that we didn't explicitly assign any priors (except for the number of planets). This means kima will use all the default priors.
Let's run the model for a few thousand steps and load the results
model, res = multi_instruments(run=True, load=True, steps=30_000)
log(Z) = -498.67164265769134 Information = 67.6310091586351 nats. Effective sample size = 2984.7236142899683
100%|██████████| 2984/2984 [00:00<00:00, 8048.22it/s]
Did we find HD106252 b?
res.plot_posterior_np();
Np probability ratios: []
Yes! In fact, even though the $N_p$ parameter was free, all the samples have $N_p=1$ because the posterior probability is so much higher than for $N_p=0$. So we're pretty sure about this detection. How does the fit look?
res.plot_random_samples();
The plot_random_samples
function just shows the Keplerian curves from a few
random posterior samples, together with the data. Note that the RV offsets for
each instrument are not subtracted from the data. This is on purpose: since
we're showing several posterior samples (50 by default), there are actually
several values for the offsets. Zooming in on one particular dataset helps (note
the several blue curves):
fig = res.plot_random_samples();
fig.axes[0].set_ylim(9600, 10200);
Looking at the posterior for the orbital period, semi-amplitude, and eccentricity of the planet clearly shows that the default priors are a bit too wide
res.plot_posterior_PKE(show_prior=True);
The situation is similar for the systemic velocity and for the RV offsets, as evidenced in the plots below, where we might not even see the posterior!
res.hist_vsys(show_prior=True)
[<Figure size 640x480 with 1 Axes>, <Figure size 1100x500 with 3 Axes>]
but they're there:
res.hist_vsys();
What we see here is that kima tries very hard to have default priors which will be appropriate for every RV dataset. However, this does sometimes mean that the priors are way too wide, which might hurt the performance of the sampler.
These issues can be easily solved by setting slightly more informative priors
for the RV offsets and some of the orbital parameters. We'll use a couple of
helper functions available in pykima.utils
to assign appropriate Gaussian
priors.
from kima.pykima.utils import (get_gaussian_prior_vsys,
get_gaussian_priors_individual_offsets)
model.Cprior = get_gaussian_prior_vsys(model.data, use_std=True)
model.Cprior
Gaussian(62; 102.095)
model.individual_offset_prior = get_gaussian_priors_individual_offsets(model.data, use_std=True)
model.individual_offset_prior
[Gaussian(15518.5; 96.3933), Gaussian(-71.9; 59.0117), Gaussian(-65.4; 61.2813)]
and also set a narrower prior for the semi-amplitude
model.conditional.Kprior = kima.distributions.Uniform(0, 500)
Let's run the model again using the new priors
kima.run(model, steps=30_000, num_threads=4)
res = kima.load_results()
log(Z) = -470.77601559796693 Information = 41.027161549708126 nats. Effective sample size = 3207.732713141128
100%|██████████| 3207/3207 [00:00<00:00, 11334.12it/s]
The priors for the systemic velocity and RV offsets are still relatively wide but much more comparable to the posteriors
res.hist_vsys(show_prior=True);
and for the orbital parameters
res.plot_posterior_PKE(show_prior=True);
In any case, the orbital parameters of HD106252 b are well recovered. The maximum likelihood solution provides an excellent fit to the data.
p = res.maximum_likelihood_sample()
res.phase_plot(p);
Sample with the highest likelihood value (logL = -422.84) -> might not be representative of the full posterior distribution jitter: [ 0. 6.40600629 0.92606019 10.59487361 3.81708227] number of planets: 1 orbital parameters: P K M0 e ω 1532.14695 140.23845 0.72411 0.48398 5.10110 instrument offsets: (relative to Lick) ELODIE HET HJS 15519.108 -98.130 -84.104 vsys: 6.95947983 0 1 1
In this example, we used some properties of the data to assign priors for a few parameters. Some Bayesians might not agree with this and, in general, they would be right. However, note in the figures above for the original analysis how there's just no posterior mass across very very wide regions of the prior. Upon realizing this, we tried to come up with more informative priors. That means that, in essence, we just performed a prior sensitivity analysis, and concluded that the posterior estimates for the parameters were consistent.
Note also that this is very different from setting a prior for the orbital period based on a periodogram analysis, for example, which is almost never justified.