Multiple instruments¶
It is quite typical to have RV observations from more than one instrument. These data should be analysed together for the detection of planets.
- Considering RVs from multiple instruments means adding new offset parameters to the model. By default, all these parameters share the same prior, but we'll see below how to change this.
- The same applies to the additional white noise (jitter). One jitter parameter is added per instrument, all sharing the same prior.
Newer versions of kima now allow for a stellar jitter parameter which is shared between all the instruments. By default, this parameter is fixed to zero to recover the old behaviour.
First of all, let's import the package
Detecting the planet around HD106252¶
First announced by Fischer et al. (2002), the giant planet orbiting HD106252 has been detected with a number of different instruments (Perrier et al. 2003, Butler et al. 2006, Wittenmyer et al.2009). In this example, we look at data from ELODIE, HET, HJS, and Lick.
Loading the datasets¶
Each of the datasets can be imported like this
and a simple visualisation of the RVs is provided by the plot method

The combined data is also readily available

Note the clear offset between ELODIE data (in blue above) and the other datasets. The average RV has been subtracted from the HET, HJS, and Lick data, but not from ELODIE. This slightly convoluted situation is ideal to demonstrate how kima deals with multiple instruments, especially with the default priors.
Simple model with default priors¶
We can now use one of the available models to analyse this dataset. We can either create the model
or import the multi_instruments function from the examples, which does the same
thing (but can also run the model directly)
Notice how we set the number of planets to be free, implicitly assigning to it
a uniform prior between 0 and 1 (by setting npmax=1).
This is all that is necessary to setup the data and model, and we are now ready to run the analysis. Notice that we didn't explicitly assign any priors (except for the number of planets). This means kima will use all the default priors.
Let's run the model for a few thousand steps and load the results
# Seeding random number generators. First seed = 1774786221.
# Generating 16 particles from the prior...done.
# Sampling...
# Creating level 1 with log likelihood = -41672699.602
# Creating level 2 with log likelihood = -22160193.243
# Creating level 3 with log likelihood = -12932226.1764
# Creating level 4 with log likelihood = -7102489.76827
# Creating level 5 with log likelihood = -4328483.50344
# Creating level 6 with log likelihood = -2837473.84961
# Creating level 7 with log likelihood = -1405544.29439
# Creating level 8 with log likelihood = -694001.107688
# Creating level 9 with log likelihood = -365390.575664
# Creating level 10 with log likelihood = -243424.077566
# Creating level 11 with log likelihood = -163717.581603
# Creating level 12 with log likelihood = -113113.945542
# Creating level 13 with log likelihood = -65736.8306782
# Creating level 14 with log likelihood = -45029.607034
# Saving particle to disk. N = 400.
# Creating level 15 with log likelihood = -30918.3294418
# Creating level 16 with log likelihood = -14790.9692475
# Creating level 17 with log likelihood = -8536.42767302
# Creating level 18 with log likelihood = -6591.00241089
# Creating level 19 with log likelihood = -5153.53941592
# Creating level 20 with log likelihood = -4261.88897482
# Creating level 21 with log likelihood = -3544.49215641
# Creating level 22 with log likelihood = -2918.62355424
# Creating level 23 with log likelihood = -2407.15783565
# Creating level 24 with log likelihood = -2011.67688725
# Saving particle to disk. N = 800.
# Creating level 25 with log likelihood = -1767.63022911
# Creating level 26 with log likelihood = -1516.64458251
# Creating level 27 with log likelihood = -1300.07516233
# Took 7.155 seconds
Loading files took 0.07 seconds
log(Z) = -972.14
Information = 30.17 nats
BMD = 0.00
Effective sample size = 1.0
Did we find HD106252 b?
Np probability ratios: []

Yes! In fact, even though the \(N_p\) parameter was free, all the samples have \(N_p=1\) because the posterior probability is so much higher than for \(N_p=0\). So we're pretty sure about this detection. How does the fit look?

The plot_random_samples function just shows the Keplerian curves from a few
random posterior samples, together with the data. Note that the RV offsets for
each instrument are not subtracted from the data. This is on purpose: since
we're showing several posterior samples (50 by default), there are actually
several values for the offsets. Zooming in on one particular dataset helps (note
the several blue curves):

Looking at the posterior for the orbital period, semi-amplitude, and eccentricity of the planet clearly shows that the default priors are a bit too wide
(10000,) (10000,) (10000,)

The situation is similar for the systemic velocity and for the RV offsets, as evidenced in the plots below, where we might not even see the posterior!

but they're there:

What we see here is that kima tries very hard to have default priors which will be appropriate for every RV dataset. However, this does sometimes mean that the priors are way too wide, which might hurt the performance of the sampler.
Informative priors¶
These issues can be easily solved by setting slightly more informative priors for the RV offsets and some of the orbital parameters. We'll use a couple of helper functions available in pykima.utils to assign appropriate Gaussian priors.
Gaussian(15580.5; 96.3933)
model_2.individual_offset_prior = get_gaussian_priors_individual_offsets(model_2.data, use_std=True)
print(model_2.individual_offset_prior)
[Gaussian(-15590.4; 59.0117), Gaussian(-15583.9; 61.2813), Gaussian(-15518.5; 102.095)]
and also set a narrower prior for the semi-amplitude
Let's run the model again using the new priors
# Seeding random number generators. First seed = 1774786237.
# Generating 4 particles from the prior...done.
# Sampling...
# level: 1 logL: -5272
level: 2 logL: -4219¶
level: 3 logL: -3803¶
level: 4 logL: -3340¶
level: 5 logL: -2635¶
level: 6 logL: -2251¶
level: 7 logL: -1859¶
level: 8 logL: -1639¶
level: 9 logL: -1428¶
level: 10 logL: -1230¶
level: 11 logL: -1044¶
level: 12 logL: -920.5¶
level: 13 logL: -870.5¶
level: 14 logL: -771.8¶
level: 15 logL: -721.9¶
level: 16 logL: -694.5¶
level: 17 logL: -678.7¶
level: 18 logL: -667.4¶
level: 19 logL: -656¶
level: 20 logL: -648.9¶
level: 21 logL: -643.3¶
level: 22 logL: -640.3¶
level: 23 logL: -635.6¶
level: 24 logL: -621.6¶
# Took 12.58 seconds
Loading files took 0.05 seconds
log(Z) = -614.91
Information = 26.06 nats
BMD = 0.01
Effective sample size = 1.0
The priors for the systemic velocity and RV offsets are still relatively wide but much more comparable to the posteriors

and for the orbital parameters
(10000,) (10000,) (10000,)

In any case, the orbital parameters of HD106252 b are well recovered. The maximum likelihood solution provides an excellent fit to the data.
Sample with the highest likelihood value (logL = -588.84)
-> might not be representative of the full posterior distribution
jitter:
[ 0. 39.14219623 13.01921112 129.15214067 53.54946682]
number of planets: 1
orbital parameters: P K M0 e w
1491.11205 70.21321 0.85408 0.04242 0.77033
instrument offsets: (relative to ELODIE)
HET HJS Lick
-15572.264 -15552.082 -15553.770
vsys: 15554.80614134

In this example, we used some properties of the data to assign priors for a few parameters. Some Bayesians might not agree with this and, in general, they would be right. However, note in the figures above for the original analysis how there's just no posterior mass across very very wide regions of the prior. Upon realizing this, we tried to come up with more informative priors. That means that, in essence, we just performed a prior sensitivity analysis, and concluded that the posterior estimates for the parameters were consistent.
Note also that this is very different from setting a prior for the orbital period based on a periodogram analysis, for example, which is almost never justified.