Multiple instruments¶

It is quite typical to have RV observations from more than one instrument. These data should be analysed together for the detection of planets.

Note

Considering RVs from multiple instruments means adding new offset parameters to the model. By default, all these parameters share the same prior, but we'll see below how to change this.
The same applies to the additional white noise (jitter). One jitter parameter is added per instrument, all sharing the same prior.

New

Newer versions of kima now allow for a stellar jitter parameter which is shared between all the instruments. By default, this parameter is fixed to zero to recover the old behaviour.

First of all, let's import the package

In [1]:

Copied!

import kima
import kima

Detecting the planet around HD106252¶

First announced by Fischer et al. (2002), the giant planet orbiting HD106252 has been detected with a number of different instruments (Perrier et al. 2003, Butler et al. 2006, Wittenmyer et al.2009). In this example, we look at data from ELODIE, HET, HJS, and Lick.

Each of the datasets can be imported like this

In [2]:

Copied!

from kima.examples.multi_instruments import HD106252_ELODIE
from kima.examples.multi_instruments import HD106252_ELODIE

and a simple visualisation of the RVs is provided by the plot method

In [3]:

Copied!

HD106252_ELODIE.plot();
HD106252_ELODIE.plot();

No description has been provided for this image

The combined data is also readily available

In [4]:

Copied!

from kima.examples.multi_instruments import HD106252_combined

HD106252_combined.plot();
from kima.examples.multi_instruments import HD106252_combined

HD106252_combined.plot();

Note the clear offset between ELODIE data (in blue above) and the other datasets. The average RV has been subtracted from the HET, HJS, and Lick data, but not from ELODIE. This slightly convoluted situation is ideal to demonstrate how kima deals with multiple instruments, especially with the default priors.

We can now use one of the available models to analyse this dataset. We can either create the model

In [5]:

Copied!

from kima import RVmodel

model = RVmodel(fix=False, npmax=1, data=HD106252_combined)
from kima import RVmodel

model = RVmodel(fix=False, npmax=1, data=HD106252_combined)

or import the multi_instruments function from the examples, which does the same thing (but can also run the model directly)

In [6]:

Copied!

from kima.examples import multi_instruments

model = multi_instruments(run=False)
from kima.examples import multi_instruments

model = multi_instruments(run=False)

Notice how we set the number of planets to be free, implicitly assigning to it a uniform prior between 0 and 1 (by setting npmax=1).

This is all that is necessary to setup the data and model, and we are now ready to run the analysis. Notice that we didn't explicitly assign any priors (except for the number of planets). This means kima will use all the default priors.

Let's run the model for a few thousand steps and load the results

In [7]:

Copied!

model, res = multi_instruments(run=True, load=True, steps=30_000)
model, res = multi_instruments(run=True, load=True, steps=30_000)

log(Z) = -498.67164265769134
Information = 67.6310091586351 nats.
Effective sample size = 2984.7236142899683

100%|██████████| 2984/2984 [00:00<00:00, 8048.22it/s]

Did we find HD106252 b?

In [8]:

Copied!

res.plot_posterior_np();
res.plot_posterior_np();

Np probability ratios:  []

Yes! In fact, even though the $N_p$ parameter was free, all the samples have $N_p=1$ because the posterior probability is so much higher than for $N_p=0$. So we're pretty sure about this detection. How does the fit look?

In [9]:

Copied!

res.plot_random_samples();
res.plot_random_samples();

The plot_random_samples function just shows the Keplerian curves from a few random posterior samples, together with the data. Note that the RV offsets for each instrument are not subtracted from the data. This is on purpose: since we're showing several posterior samples (50 by default), there are actually several values for the offsets. Zooming in on one particular dataset helps (note the several blue curves):

In [10]:

Copied!

fig = res.plot_random_samples();
fig.axes[0].set_ylim(9600, 10200);
fig = res.plot_random_samples();
fig.axes[0].set_ylim(9600, 10200);

Looking at the posterior for the orbital period, semi-amplitude, and eccentricity of the planet clearly shows that the default priors are a bit too wide

In [11]:

Copied!

res.plot_posterior_PKE(show_prior=True);
res.plot_posterior_PKE(show_prior=True);

The situation is similar for the systemic velocity and for the RV offsets, as evidenced in the plots below, where we might not even see the posterior!

In [12]:

Copied!

res.hist_vsys(show_prior=True)
res.hist_vsys(show_prior=True)

Out[12]:

[<Figure size 640x480 with 1 Axes>, <Figure size 1100x500 with 3 Axes>]

but they're there:

In [13]:

Copied!

res.hist_vsys();
res.hist_vsys();

What we see here is that kima tries very hard to have default priors which will be appropriate for every RV dataset. However, this does sometimes mean that the priors are way too wide, which might hurt the performance of the sampler.

These issues can be easily solved by setting slightly more informative priors for the RV offsets and some of the orbital parameters. We'll use a couple of helper functions available in pykima.utils to assign appropriate Gaussian priors.

In [14]:

Copied!

from kima.pykima.utils import (get_gaussian_prior_vsys, 
                               get_gaussian_priors_individual_offsets)
from kima.pykima.utils import (get_gaussian_prior_vsys, 
                               get_gaussian_priors_individual_offsets)

In [15]:

Copied!

model.Cprior = get_gaussian_prior_vsys(model.data, use_std=True)
model.Cprior
model.Cprior = get_gaussian_prior_vsys(model.data, use_std=True)
model.Cprior

Out[15]:

Gaussian(62; 102.095)

In [16]:

Copied!

model.individual_offset_prior = get_gaussian_priors_individual_offsets(model.data, use_std=True)
model.individual_offset_prior
model.individual_offset_prior = get_gaussian_priors_individual_offsets(model.data, use_std=True)
model.individual_offset_prior

Out[16]:

[Gaussian(15518.5; 96.3933),
 Gaussian(-71.9; 59.0117),
 Gaussian(-65.4; 61.2813)]

and also set a narrower prior for the semi-amplitude

In [17]:

Copied!

model.conditional.Kprior = kima.distributions.Uniform(0, 500)
model.conditional.Kprior = kima.distributions.Uniform(0, 500)

Let's run the model again using the new priors

In [18]:

Copied!

kima.run(model, steps=30_000, num_threads=4)
kima.run(model, steps=30_000, num_threads=4)

In [21]:

Copied!

res = kima.load_results()
res = kima.load_results()

log(Z) = -470.77601559796693
Information = 41.027161549708126 nats.
Effective sample size = 3207.732713141128

100%|██████████| 3207/3207 [00:00<00:00, 11334.12it/s]

The priors for the systemic velocity and RV offsets are still relatively wide but much more comparable to the posteriors

In [26]:

Copied!

res.hist_vsys(show_prior=True);
res.hist_vsys(show_prior=True);

and for the orbital parameters

In [23]:

Copied!

res.plot_posterior_PKE(show_prior=True);
res.plot_posterior_PKE(show_prior=True);

In any case, the orbital parameters of HD106252 b are well recovered. The maximum likelihood solution provides an excellent fit to the data.

In [24]:

Copied!

p = res.maximum_likelihood_sample()
res.phase_plot(p);
p = res.maximum_likelihood_sample()
res.phase_plot(p);

Sample with the highest likelihood value (logL = -422.84)
-> might not be representative of the full posterior distribution

jitter:
[ 0.          6.40600629  0.92606019 10.59487361  3.81708227]
number of planets:  1
orbital parameters:           P           K          M0           e           ω 
                     1532.14695   140.23845     0.72411     0.48398     5.10110
instrument offsets:  (relative to Lick) 
                     ELODIE                HET                   HJS                  
                     15519.108             -98.130               -84.104              
vsys:  6.95947983
0 1 1

Warning

In this example, we used some properties of the data to assign priors for a few parameters. Some Bayesians might not agree with this and, in general, they would be right. However, note in the figures above for the original analysis how there's just no posterior mass across very very wide regions of the prior. Upon realizing this, we tried to come up with more informative priors. That means that, in essence, we just performed a prior sensitivity analysis, and concluded that the posterior estimates for the parameters were consistent.

Note also that this is very different from setting a prior for the orbital period based on a periodogram analysis, for example, which is almost never justified.