#### TL;DR

If you’re busy, do this

git clone --recursive https://github.com/j-faria/kima.git
cd kima
make -j 4
python setup.py install

cd examples/*CHOOSE ONE*
kima-run

kima-showresults all


For more details, read on.

#### Installation

kima is written in C++ and needs to be compiled, so you will need a working C++ compiler.
We run regular tests with the gcc and clang compilers:

To install, first clone the GitHub repository

 git clone --recursive https://github.com/j-faria/kima.git


don’t foget the --recursive to download the required submodules.

then go to the kima directory and compile the code (optionally, in parallel)

 cd kima
make  # -j 4


To analyse the results, kima comes with a helper Python package called pykima.
To install pykima, from inside the kima directory run

 python setup.py install  # or python setup.py develop


you may choose to analyse the results without pykima since the inputs and outputs of kima itself are simple text files. pykima will just make it easier.

If the installation fails, take a look at the troubleshooting section of the wiki or open an issue.

#### Getting started

##### Running the examples

kima comes packed with ready-to-run examples inside the kima/examples directory. All of them are described to some length here.

To run any example, go to the directory and use the kima-run script (part of pykima)

 cd examples/*CHOOSE ONE*
kima-run

##### Creating a template directory

After installing pykima, you can run the kima-template script to create a template directory containing the necessary files to run kima

 kima-template my-new-planet
# Populating directory my-new-planet
# with kima template files: kima_setup.cpp, OPTIONS, Makefile, README.md


You can now modify the kima_setup.cpp file to set up the model and run the analysis with kima-run as for the examples.

#### Building models

##### General settings

Every model in kima starts with a few general settings

const bool GP
const bool MA
const bool hyperpriors
const bool trend
const int degree
const bool multi_instrument
const bool known_object
const int n_known_object
const bool studentt


which always need to be defined.

yes, this is annoying, I’m working on a solution!

While some are self-explanatory, others can get be a bit tricky to understand. Here’s a short explanation for each one.

• GP
Whether to use a Gaussian process (GP) as a model for correlated noise in the RVs.

• MA
Whether to use a Moving Average as a model for correlated noise in the RVs.

the moving average model is still experimental

• hyperpriors
Whether to use hyper-priors for the orbital periods and for the semi-amplitudes of the planets. See the section on priors for more information.

• trend and degree
Consider a long-term trend in the RVs, of a given degree. This is a linear trend for degree=1, a quadratic for degree=2, and a cubic for degree=3. Higher degree trends are not implemented.

• multi_instrument
Whether the RV data comes from multiple instruments. See this example for more information.

a typical source of issues is loading multiple datasets with the load_multi function (see below) but forgetting to set multi_instrument = true.

• known_object and n_known_object
In known object (KO!) mode, kima considers some Keplerians as part of a “background” model, not included in the $N_p$ other planets. This can be useful when modeling transiting planets or simply planets with better-known orbital parameters.

• studentt
Whether to use a Student-t distribution as the likelihood, instead of a Gaussian. If this is set to true, the degrees of freedom $\nu$ of the Student-t distribution can be estimated from the data. This is sometimes useful when the RV data is suspected to contain (a few) strong outliers. See the example for more information.

##### Types of models

kima currently implements two kinds of models

• RVmodel
which models the RVs with a sum-of-Keplerians

• RVFWHMmodel
which models the RVs together with the FWHM as an activity indicator

The type of model to be used is set by defining a constructor like

RVmodel::RVmodel()
{
...
}


or

RVFWHMmodel::RVFWHMmodel()
{
...
}


a constructor in C++ is a special function that instantiates an object, in this case of type RVmodel or RVFWHMmodel. If this sounds weird, think of it as the __init__ function in a Python class. If this also sounds weird, just think of it as a simple function where we implement the model.

##### The $N_p$ planets

kima can model Keplerian functions, the RV signals induced by planets. But the distinguishing feature is that the code can actually model a number $N_p$ of Keplerians, and this number does not need to be fixed a priori. In other words, kima estimates the joint posterior distribution for $N_p$ and the orbital parameters of the planets $\theta$, given some data $\mathcal{D}$:

$p(N_p, \theta | \mathcal{D} )$

So $N_p$ is actually a free parameter in the model (if we want). It is somewhat of a special parameter, in that it only takes discrete values, but it’s otherwise similar to other parameters in the model. In particular, it needs a prior distribution as well.

This part of the model in kima can be set with a special pair of parameters for the model constructor

RVmodel::RVmodel() : fix(false) npmax(2)
{
...
}


which define whether of not $N_p$ is fixed and its value or prior distribution

$N_p \begin{cases} = \texttt{npmax}, & \text{if \texttt{fix}}\\ \sim \mathcal{U}(0, \texttt{npmax}), & \text{otherwise} \end{cases}$

Each of the $N_p$ planets has 5 orbital parameters

$\theta = { P, K, e, M_0, \omega }$

corresponding to the orbital period, semi-amplitude, eccentricity, mean anomaly at the epoch, and argument of periastron.

units
$P$ is in [days], $K$ is in [m/s], $e$ is unitless, and $M_0$ and $ω$ are in radians.

By default, the epoch is set to the first observed time, but it can be changed inside the constructor by setting the variable M0_epoch.

##### The “background” model

Besides the $N_p$ planets, kima models the RV observations as follows

• by default

$v_i \sim \mathcal{N} \left( v_{sys} \,,\: j^2+\sigma_i^2 \right)$

• if using a Student-t likelihood

$v_i \sim \mathcal{T} \left( v_{sys} \,,\: j^2+\sigma_i^2, \nu \right)$

• if using a Gaussian process

$v \sim \mathcal{GP} \left( \boldsymbol{\mu} = v_{sys} \,,\: \boldsymbol{\Sigma} = {\bf K} + (j^2+\sigma_i^2)\,\delta_{ij}\,{\bf I} \right)$

where $j$ is a jitter parameter that represents additional white noise variations not accounted for by the individual uncertainties.

When considering a long-term trend (by setting trend=true), instead of a constant systemic velocity, we have

\begin{aligned} v_{sys} \rightarrow v_{sys} & + slope \times (t-t_m) \qquad \text{if \texttt{degree}} \ge 1\\ & + quadr \times (t-t_m)^2 \quad\, \text{if \texttt{degree}} \ge 2\\ & + cubic \times (t-t_m)^3 \quad\:\: \text{if \texttt{degree}} = 3 \end{aligned}

where $t_m$ is the mean of the times.

##### The known object(s)

In known object (KO!) mode, kima considers some Keplerians as part of the “background” model. These are not included in the $N_p$ other planets and so their number is fixed.

For example, if we know of a transiting planet orbiting a given star, we can set

const bool known_object = true;
const int n_known_object = 1;


and use information derived from the transit in the priors (see below).

Let’s imagine we now set

RVmodel::RVmodel() : fix(false) npmax(3)


This would mean that our model asserts the presence of one Keplerian signal and we are trying to test up to three additional Keplerians. In other words, our inference on the posterior for $N_p$ is conditioned on the existence of the known object.

##### Setting the priors

As in any Bayesian analysis, kima needs a set of priors for the model parameters. If we don’t explicitly set the priors in the kima_setup.cpp file, a default prior will be used and the model will run. But sometimes (most times?) we will want to set custom priors for some parameters.

To change specific priors, we just need to re-define some variables inside the RVmodel constructor. The most important thing is that the priors for the planet orbital parameters require the conditional object, while the priors for the other parameters don’t.

For example, we can define two new priors

RVmodel::RVmodel() : fix(true), npmax(1)
{
// define new prior for the extra white noise parameter, j
Jprior = make_prior<Uniform>(1.0, 20.0); // in m/s

// get the conditional prior object
auto conditional = planets.get_conditional_prior();

// define new prior for the orbital period(s)
conditional->Pprior = make_prior<Gaussian>(15.0, 0.1); // in days
}


note the small difference in syntax when defining the two priors.

While it might seem counter-intuitive at first, in kima the priors for the orbital parameters are the same for all $N_p$ planets. This is intentional!
If this is not what you want, consider using the known object feature.

In the example above, the argument between < > in the calls to the make_prior function (called template arguments) specify the prior distribution and the arguments inside parentheses set the parameters of each distribution.

kima (and DNest4) currently implement the following distributions:

distribution arguments link
Uniform lower, upper wiki, scipy!
LogUniform lower, upper scipy
ModifiedLogUniform knee, upper
Rayleigh scale
Triangular lower, center, upper wiki, scipy
Laplace center, width wiki
Kumaraswamy alpha, beta wiki
Gaussian mean, stdev wiki, scipy
Exponential mean wiki
Cauchy location, scale wiki, scipy
Fixed value
TruncatedGaussian mean, stdev, lower, upper
TruncatedCauchy location, scale, lower, upper

If you need a distribution that is not yet implemented, consider opening an issue.

###### Checking if the priors are correct

For any kima model, it’s very easy to check if the priors are set up correctly. In the OPTIONS file, set the maximum number of levels to 1, and run the model as usual. With this setting, DNest4 will sample directly from the priors and ignore the likelihood.

The kima-checkpriors script is provided with the pykima package to make histograms of the samples of each parameter. Simply call it with the column or the name of the parameter you want to check (look inside sample.txt to see which one is which)

For example

kima-checkpriors 1


might print

Histogram of column 1: jitter
number of samples: 10000
max value: 1999.744937
min value: 0.000128


and display the histogram of the prior samples:

###### Default priors

Below is the list of default priors which are used if not explicitly re-defined in the model constructor.

expand
• orbital periods and semi-amplitudes
units: days and m/s, respectively

// if hyperpriors = false (default)
Pprior = make_shared<LogUniform>(1.0, 1e5);
Kprior = make_shared<ModifiedLogUniform>(1.0, 1e3);

// if hyperpriors = true
Pprior = make_shared<Laplace>(exp(log_muP), wP);
Kprior = make_shared<Exponential>(exp(log_muK));

• orbital eccentricities
eprior = make_shared<Uniform>(0, 1);


another good option could be

eprior = make_shared<Kumaraswamy>(0.867, 3.03);


which approximates the Beta distribution proposed by Kipping (2013).

• orbital phase and argument of periastron
units: radians

phiprior = make_shared<Uniform>(0, 2*PI);

wprior = make_shared<Uniform>(0, 2*PI);

• systemic velocity
units: m/s

Cprior = make_prior<Uniform>(data.get_RV_min(), data.get_RV_max());

• additional white noise, or jitter
units: m/s

knee = min(1.0, 0.1*data.get_max_RV_span())
upper = data.get_max_RV_span()

Jprior = make_prior<ModifiedLogUniform>(knee, upper);

• ccoefficients of the trend (if trend=true)
units: m/s/day, m/s/day², m/s/day³
  slope_prior = make_prior<Gaussian>(0.0, pow(10, data.get_trend_magnitude(1)));
quadr_prior = make_prior<Gaussian>(0.0, pow(10, data.get_trend_magnitude(2)));
cubic_prior = make_prior<Gaussian>(0.0, pow(10, data.get_trend_magnitude(3)));

• between-instrument offsets units: m/s

lower = -data.get_RV_span();
upper = data.get_RV_span();
offsets_prior = make_prior<Uniform>(lower, upper);

• GP hyperparameters
units: $\eta_1$ in m/s, $\eta_2$ and $\eta_3$ in days

log_eta1_prior = make_prior<Uniform>(-5, 5);

eta2_prior = make_prior<LogUniform>(1, 100);

eta3_prior = make_prior<Uniform>(10, 40);

log_eta4_prior = make_prior<Uniform>(-1, 1);

• degrees of freedom for Student-t likelihood

nu_prior = make_prior<LogUniform>(2, 1000);

• moving average parameters

sigmaMA_prior = make_prior<Uniform>(-1, 1);
tauMA_prior = make_prior<LogUniform>(1, 100);

• hyperparameters for orbital period and semi-amplitude hierachical priors
only if hyperpriors=true

log_muP_prior = make_prior<TruncatedCauchy>(log(365), 1.0, log(365)-21, log(365)+21);

wP_prior = make_prior<Uniform>(0.1, 3.0);

log_muK_prior = make_prior<TruncatedCauchy>(0.0, 1.0, -21.0, 21.0);

• correlation coefficients with activity indicators

betaprior = make_prior<Gaussian>(0, 1);