OCaml Probability Cache Library
This library provides a polymorphic cache for maintaining random events to their observed probabilities and expectations. There are two types of supported probability models- a set model and sequence model, either of which can be used in process (with containers) or over a distributed riak cache (with riakc_ppx).
All of these are installable via opam.
- containers (for in process cache)
- sequence (supplement to containers)
- ppx_deriving (for implementing show and comparable)
- ppx_deriving_protobuf (for distributed cache)
- riakc_ppx (for distributed cache)
- async (to interact with riak cache)
- oasis (for compiling)
To install simply run
opam install prob-cache
or you can pin this repository:
opam pin add prob-cache --dev-repo
The following will clone the repository, build locally, and then pin to opam:
git clone [email protected]:/struktured/ocaml-prob-cache.git
cd ocaml-prob-cache
oasis setup
make install
opam pin add prob-cache .
The API documentation is published on github pages.
(* A toy event model where it can be raining, a water sprinkler may be on, and the ground may be wet
* due to one, both, or none of these events *)
module Event = struct type t = IS_RAINING | SPRINKLER_ON | GROUND_IS_WET [@@deriving show, ord] end
module Model = Prob_cache_containers.Set_model.Make(Event)
open Event
let raining = Model.Events.of_list [IS_RAINING]
let sprinkler_on = Model.Events.of_list [SPRINKLER_ON]
let ground_wet_raining = Model.Events.of_list [IS_RAINING; GROUND_IS_WET]
let ground_wet_sprinkler_on = Model.Events.of_list [SPRINKLER_ON; GROUND_IS_WET]
let ground_wet = Model.Events.of_list [GROUND_IS_WET]
let ground_not_wet_sprinkler_on = Model.Events.of_list [SPRINKLER_ON]
let m = Model.create "toy-model" |>
Model.observe ground_wet_raining |>
Model.observe ground_wet_sprinkler_on |>
Model.observe ground_wet |>
Model.observe ground_not_wet_sprinkler_on
let ground_wet_given_raining = Model.prob ~cond:raining ground_wet m (* returns 1 *)
let ground_wet_given_sprinkler_on = Model.prob ~cond:sprinkler_on ground_wet m (* returns .5 *)
let ground_wet = Model.prob ground_wet m (* returns .75 *)
(* Models a sequence of coin flips, not necesssarily where each flip is independent *)
module Coin = struct type t = HEADS | TAILS [@@deriving show, ord] end
module Model = Prob_cache_containers.Sequence_model.Make(Coin)
open Coin
let events = Model.Events.of_list [HEADS;TAILS;HEADS;TAILS]
let heads = Model.Events.of_list [HEADS]
let tails = Model.Events.of_list [TAILS]
let heads_tails = Model.Events.of_list [HEADS;TAILS]
let m = Model.create "coin-flips" |>
Model.observe events
let cnt,exp = Model.count events m, Model.exp events m (* cnt=1, exp = 1. *)
let a = Model.prob heads m (* returns 1. *)
let b = Model.prob tails m (* returns 0. *)
let c = Model.prob heads_tails m (* return 1. *)
For a complete list of containers examples, see https://github.com/struktured/ocaml-prob-cache/tree/master/src/examples.
See https://github.com/struktured/ocaml-prob-cache/tree/master/src/riak_examples.
Both models are data driven, caching only what is observed, but are brute force in that they make no explicit attempt at efficient encodings or sparse representations.
Per observation, the set model is exponential with respect to the number of observed events. It stores 2^N instances per observation containing N events.
The sequence model is linear with respect to the sequence length. It stores N events for an observed sequence of length N.
- Open an issue on github
- Contact struktured on #ocaml freenode irc.
- Unit tests
- Lwt support
- Modules for other caching middlewares
- Sparser representations
- Distance metrics to estimate distributions on new observations
- Possible integration with https://github.com/agarwal/future
- GPU optimizations