Nix

Nix is not a configuration management tool like Puppet, Chef or Salt. It is more accurately described as a (universal) package manager.

In that regard, unless you are running NixOS or Disnix (or use other tricks), Nix won't configure systemd services, for instance. Nix only operates on its store (usually located in /nix) to gather packages, called derivations in Nix parlance.

Nix is a radical rethink of the distribution model. It offers:

  • best possible build reproducibility

  • self-contained environments

  • easy rollback

  • composability of derivations

NixOS

Install

  1. Switch to an AZERTY keyboard

    → loadkeys be-latin1
  2. Partition with gdisk (efi) or fdisk (noefi)

    Using VirtualBox you don't want/need EFI.

    (g/f)disk /dev/sda

    Create 2 partitions: sda1 (type 83, the default) and sda2 (type 82, swap).

    [efi] Create an extra (boot) partition with type EF00.

  3. Create file system

    → mkfs.ext4 -L nixos /dev/sda1
    → mkswap -L swap /dev/sda2

    [efi] Choose vfat.

  4. Mount it

    → mount /dev/disk/by-label/nixos /mnt

    [efi] mkdir /mnt/boot and mount the boot partition there.

  5. Generate a default config

    → nixos-generate-config --root /mnt
  6. Minimally edit the config; don't forget to uncomment the option 'boot.loader.grub.device'

    → vim /mnt/etc/nixos/configuration.nix

    [efi] No edit required.

  7. Install

    → nixos-install
  8. Reboot

    → reboot
  9. Upgrade

    → nixos-rebuild boot --upgrade
    → reboot
  10. Configuration

Configuration

Some Nix properties are set in /etc/nix/nix.conf.

For wifi, configure it manually using NetworkManager through the nmtui text interface.

/etc/nixos/configuration.nix
  nixpkgs.config.allowUnfree = true;

  i18n = {
    consoleFont = "Lat2-Terminus16";
    consoleKeyMap = "be-latin1";
    defaultLocale = "en_US.UTF-8";
  } ;

  environment.systemPackages = with pkgs; [
    asciidoctor (1)
  ];

  # Define a user account. Don't forget to set a password with ‘passwd’.
  users.extraUsers.nix = { (2)
    createHome = true;
    home = "/home/nix";
    isSystemUser = true;
    extraGroups = [ "wheel" "disk" "vboxusers" "docker"];
    shell = "/run/current-system/sw/bin/bash";
    uid = 1000;
  };

  programs.bash.enableCompletion = true;
  security.sudo.wheelNeedsPassword = false;

  fonts = {
    enableFontDir = true;
    fonts = [ pkgs.source-code-pro ];
  };

  nix.extraOptions = ''
    gc-keep-outputs = true (3)
    gc-keep-derivations = true (3)
  '';

  virtualisation.docker.enable = true;
  virtualisation.docker.extraOptions = "--insecure-registry x.lan --insecure-registry y.lan";

  virtualisation.virtualbox.guest.enable = true; (4)
  boot.initrd.checkJournalingFS = false; (4)
1 add packages
2 do create a new user!
(root won't be able to have a chromium session by default)
3 prevent too much GC in a developer environment
4 VirtualBox only

System management

Update
→ sudo nixos-rebuild switch
→ sudo nixos-rebuild boot --upgrade (1)
1 safer to use boot when upgrading

Derivation

Nix produces a build product in two steps:

Nix expression    (evaluation) →    Derivation    (realisation) →    Build product

The first step, evaluation, is pure. The produced .drv file acts as an intermediate specification for a build that can be freely redistributed to a set of machines.

Derivations are stored in the nix store as /nix/store/hash-name, where the hash uniquely identifies the derivation (not strictly true, it's a little more complex than this) and name is the name of the derivation.

From a nix language point of view, a derivation is simply a set, with some attributes.

To build a package, nixpkgs makes heavy use of stdenv and its function mkDerivation:

stdenv.mkDerivation rec {
  name = "libfoo-${version}"; (1)
  version = "1.2.3";
  src = fetchurl {
    url = http://example.org/libfoo-1.2.3.tar.bz2;
    md5 = "e1ec107956b6ddcb0b8b0679367e9ac9"; (2)
  };
  builder = ./builder.sh; (3)
  buildInputs = [ruby]; (4)
}
1 mandatory name attr
2 mandatory checksum for remote source
3 if not provided, the generic builder is used
4 additional input required to build the derivation[1]

The output of a derivation needs to be deterministic. That's why you can fetch sources remotely only if you know their hash beforehand.

runtime dependencies

A derivation never specifies its runtime dependencies. These are automatically computed by Nix. You can print them with:

nix-store -q --tree $(nix-store -qd $(which cabal2nix))
overrideDerivation drv f

takes a derivation and returns a new derivation in which the attributes of the original are overridden according to the function f. Most of the time, you should prefer overrideAttrs.
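
For example, a minimal sketch (the package and patch names are illustrative) that adds a patch on top of an existing derivation with overrideAttrs:

hello-patched = pkgs.hello.overrideAttrs (old: {
  patches = (old.patches or []) ++ [ ./fix.patch ];
});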

Channels

A channel is the Nix mechanism for distributing a consistent set of Nix expressions and binaries. nix-channel --add

→ nix-channel --add http://nixos.org/channels/nixpkgs-unstable
→ nix-channel --update
→ nixos-rebuild switch

The unstable channel usually lags a few days behind nixpkgs master. For a precise status, check the channel status page.

You can directly use a derivation from master. For instance, after cloning nixpkgs, you could type:

NIX_PATH=nixpkgs=/home/vagrant/projects/nix/nixpkgs nix-env -f '<nixpkgs>' -iA haskellPackages.stack
  • In a future version of Nix, channels might be deprecated in favor of NIX_PATH alone.

  • On NixOS, you should stick to nixos-unstable (don't use nixpkgs-unstable because specific NixOS sanity checks won't be applied)

Nix-shell

When Nix builds a package, it builds it in an isolated environment. It does this by creating a clean, child shell, then adding only the dependencies the package declares. After setting up the dependencies, it runs the build script, moves the built app into the Nix store, and sets up the environment to point to it. Finally, it destroys this child shell.

But we can ask Nix to not destroy the child shell, and instead let us use it for working iteratively on the app. This is what the nix-shell is about: it will build the dependencies of the specified derivation, but not the derivation itself.

 nix-shell '<nixpkgs>' -p ruby haskellPackages.stack (1)
1 -p and -A are mutually exclusive

If a path is not given, nix-shell defaults to shell.nix if it exists, and default.nix otherwise.[2]

This allows for a nice trick. We can describe a virtual dev environment (of any sort, for any language) by describing a derivation in default.nix like so:

default.nix
with import <nixpkgs> {};

let henv = haskellPackages.ghcWithPackages (p: with p; [shake]);

in
stdenv.mkDerivation {
  name = "haskell-env";
  buildInputs = [ henv pythonPackages.pyyaml];
}

nix-shell will use the NIX_PATH environment variable, which by default in user space points to the root nixpkgs channel. That means that (unlike with nix-env), even if your user-space channel points to unstable, nix-shell might still use the root user's stable channel. You can change that behavior by running, for instance:

nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs-channels/archive/nixos-unstable.tar.gz

You can force any script file to run in a nix-shell as such:

#! /usr/bin/env nix-shell
#! nix-shell -i bash

or without a default.nix file:

#! /usr/bin/env nix-shell
#! nix-shell --pure
#! nix-shell -p asciidoctor -p pythonPackages.pygments
#! nix-shell -p "haskellPackages.ghcWithPackages(p: with p; [shake])" (1)
#! nix-shell -i bash
1 Double quotes are required. Don't add -p ghc as you will end up with two different ghcs!

In Haskell, we need the --attr env to tell nix-shell to compute the isolated development environment:

shell.nix
with (import <nixpkgs> {}).pkgs;
(haskellPackages.callPackage ./. {}).env (1)
1 callPackage will use the currently defined scope to pass matched arguments

default.nix is then generated by cabal2nix and describes how to nix-build the Haskell package.

Nix-env

nix-env is the command used to search, install and remove packages locally in user space (a profile). These packages are installed in the nix store but are only accessible inside one environment (aka user/profile).

nix-env doesn’t require a starting nix expression. As a consequence, nix-env does not use <nixpkgs> as NIX_PATH. It actually uses ~/.nix-defexpr/channels.
If you want to use <nixpkgs>, you would explicitly use the -f (or --file) option on the command line.

  • -q list installed derivations within a profile

  • -qaP list available packages with their attribute path

When searching for packages, it is usually more efficient to specify a namespace attribute using the -A option.

# in nixos:
→ nix-env -qaP -A nixos.haskellPackages
→ nix-env -qaP -A nixos.pythonPackages
# outside nixos:
→ nix-env -qaP -A nixpkgs.pythonPackages

You can also omit the channel namespace and specify the input for nixpkgs explicitly with the -f option:

→ nix-env -f '<nixpkgs>' -qaP -A haskellPackages.shake --description
  • -i install derivations

    → nix-env -f '<nixpkgs>' -iA pythonPackages.pyyaml (1)
    → nix-env -f '<nixpkgs>' -i brackets -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/master.tar.gz (2)
    1 on nixos, you might use nix-env -iA nixos.pythonPackages.pyyaml
    2 install from master directly
  • -e erase

    → nix-env -e python2.7-PyYAML-3.11
  • -u update

    → nix-env -u

Nix-build

The nix-build tool does two main jobs (roughly equivalent to the two commands shown after this list):

  • nix-instantiate: parse the .nix file and return the .drv file (the evaluation step)

  • nix-store -r: realise the build product from the input .drv derivation
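
Put differently, a nix-build run is roughly equivalent to the following two commands (the store path is illustrative):

→ nix-instantiate default.nix               # evaluation: produces the .drv file
→ nix-store -r /nix/store/<hash>-name.drv   # realisation: builds the product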

nix-pull is deprecated and replaced by the use of binary caches

Language Expressions

String
let
  h = "Hello";
  value = 4;
in
{
  helloWorld = "${h} ${toString value} the win!"; (1)
}
1 interpolation of the toString builtin function to convert an int value
List
[ 123 ./foo.nix "abc" (f { x = y; }) ]
Attribute Set
let x = 12;
    y = 34;
    f = {n}: 5 + n;
in
rec {
  r = { inherit x y; (1)
    text = "Hello";
    add = f { n = 56; }; (2)
  };
  sum = r.add + r.y;
  hello = r.text or "World"; (3)
  b = r ? x; (4)
}
1 when defining a set it is often convenient to copy variables from the surrounding lexical scope
2 all ; are mandatory
3 Sets accessor using .
Default using or
4 does the set 'r' contain an attribute 'x'?
Function
pattern: body
# `min` and `max` are available in stdenv.lib
min = x: y: if x < y then x else y; (1)
1 pattern is a func returning a func (2 arguments)
{stdenv, fetchurl, perl, ... }: (1)

  stdenv.mkDerivation { (2)
    name = "hello-2.1.1";
	...
  };
1 pattern is a set of arguments
the 'ellipsis' (…​) allows the passing of a bigger set, one that contains more than the 3 required arguments.
2 function call passing a set as argument
Common functions
listToAttrs (1)
  [ { name = "foo"; value = 123; }
    { name = "bar"; value = 456; }
  ]
1 like fromList in Haskell, except there is no tuple type in Nix
With
with e1; e2

Introduces all attributes of the set e1 into the lexical scope of the expression e2:

let as = { x = "foo"; y = "bar"; };
in
with as; x + y # "foobar"
Optional argument
{ x, y ? "foo", z ? "bar" }: z + y + x (1)
1 a function that only requires an attribute named x, but optionally accepts y and z.
Merge sets
e1 // e2 # merge e1 and e2 with e2 taking precedence in case of equally named attribute
Logical implication
e1 -> e2 (1)
1 if e1 is false, return true; otherwise return e2 (i.e. check that e2 is true). Useful with assert
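
For example, a minimal sketch (argument names are made up) where assert uses the implication to require a certificate only when SSL is enabled:

{ enableSsl ? false, certificate ? null }:
assert enableSsl -> certificate != null;
{ inherit enableSsl certificate; }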

Nix modules

A NixOS module is a file that handles one logical aspect of the configuration.

{ config, lib, pkgs, ... }: (1)

{
  imports = (2)
    [
    ];

  options.services.foo = { (3)
    enable = lib.mkOption {
      type = lib.types.bool;
      default = false;
      description = ''

      '';
    };
    ...
  };

  config = lib.mkIf config.services.foo.enable { (4)
    environment.systemPackages = [ ... ];
  };
}
1 function declaration with access to the full system configuration and nixpkgs
2 paths to other modules that should be included in the evaluation
3 options declaration
4 option definition
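
A hypothetical usage from configuration.nix, assuming the module above is saved as ./foo.nix:

{
  imports = [ ./foo.nix ];
  services.foo.enable = true;
}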

Tips and tricks

Customize nixpkgs locally

You can override derivation attributes in user space without forking the nixpkgs repository. In ~/.nixpkgs/config.nix you typically declare a packageOverrides function and then use override to customize attributes:

~/.nixpkgs/config.nix
{
  packageOverrides = super: (1)
    let self = super.pkgs;
        foo = super.foo.override { barSupport = true ; }; (2)
    in
    {
      inherit foo;
      haskellPackages = super.haskellPackages.override {
        overrides = self: super: { (3)
          language-puppet_1_3_3 = self.callPackage ./pkgs/language-puppet { inherit foo; }; (4)
        };
      };
    };
}
1 packageOverrides takes the original (super) nixpkgs set and returns a new (self) record set. [3]
2 call override (defined on most derivations) to change the arguments passed to it.
3 override the overrides attribute of haskellPackages
4 key = value of the returned set
override/fix pattern
fix = f:
  let self = f self;
  in self;

extend = attrs: f: self:
  let super = attrs self;
  in super // f self super;

ps = self:
  { foo = "foo"; bar = "bar";
     foobar = self.foo + self.bar;
  };

f = self: super:
  { foo = reverse super.foo; }

(fix ps).foobar # "foobar"

(fix (extend ps f)).foobar # "oofbar"
Overlays

Since 17.03 there is a more idiomatic way to achieve this kind of local customization:

~/.config/nixpkgs/overlays/default.nix
self: super:
let
  hlib = super.haskell.lib;
in
{
  haskellPackages = super.haskellPackages.override {
    overrides =  hpkgs: _hpkgs: {
      cicd-shell = hlib.dontCheck (hlib.dontHaddock
        (_hpkgs.callCabal2nix "cicd-shell" (super.fetchgit { (1)
           url = "http://stash.cirb.lan/scm/cicd/cicd-shell.git";
           rev = "d76c532d69e4d01bdaf2c716533d9557371c28ea";
           sha256 = "0yval6k6rliw1q79ikj6xxnfz17wdlnjz1428qbv8yfl8692p13h";
         }) {
              protolude = _hpkgs.protolude_0_2;
            }
        ));
      };
    };
}
1 callCabal2nix lets you automatically fetch and build any Haskell package from the web
Overriding Haskell packages for the ghc821 compiler
self: super:
let
  hlib = super.haskell.lib;
in
{
  haskell = super.haskell // {
    packages = super.haskell.packages // {
      ghc821 = super.haskell.packages.ghc821.override { (1)
        overrides = hpkgs: _hpkgs: {
          containers = hlib.dontCheck _hpkgs.containers;
        };
      };
    };
  };
}
1 haskell equals super.haskell except for packages, which equals super.haskell.packages except for ghc821, which is the overridden version of super's ghc821
Private packages

You can also extend nixpkgs with private derivations without any forking. For instance using a custom file:

dotfiles.nix
with import <nixpkgs> {}; (1)

let xmonadEnv = haskellPackages.ghcWithPackages (p: with p; [xmonad xmonad-contrib]); (2)
in

stdenv.mkDerivation {
  name = "devbox_dotfiles-0.1";

  src = fetchFromGitHub {
    owner = "CIRB";
    repo = "devbox-dotfiles";
    rev = "801f66f3c7d657f5648963c60e89743d85133b1a" ;
    sha256 = "1w4vaqp21dmdd1m5akmzq4c3alabyn0mp94s6lqzzp1qpla0sdx0" ;
  };

  buildInputs = [ xmonadEnv ];

  installPhase = ''
    ${xmonadEnv}/bin/ghc --make .xmonad/xmonad.hs -o .xmonad/xmonad-x86_64-linux (3)
    cp -R ./. $out (4)
  '';

  meta = {
    description = "Dot files for the devbox";
  };
}
1 dependencies provided by nixpkgs using $NIX_PATH
2 ghc with module deps included
3 at this stage, the shell is inside a temp dir with the src included
4 copy the content of the current dir into $out

You can then build the derivation or install it in the user environment.

→ nix-build dotfiles.nix
→ nix-env -f dotfiles.nix -i devbox_dotfiles (1)
1 nix-env -i takes the name attribute and strips the version (the first numeric part after a '-')
Pinning a version of nixpkgs
let
  nixpkgs = builtins.fromJSON (builtins.readFile ./.nixpkgs.json);
in
import (fetchTarball {
  url = "https://github.com/NixOS/nixpkgs/archive/${nixpkgs.rev}.tar.gz";
  inherit (nixpkgs) sha256;
})

Updating .nixpkgs.json can be done with a zsh function such as:

function updateNixpkgs () {
    nix-prefetch-git https://github.com/NixOS/nixpkgs.git "$1" > ~/.config/nixpkgs/.nixpkgs.json
}
Caching the list of all available packages into a local file
nix-env -qaP --description '*' > ~/allpkgs.desc
Reproduce any Hydra build locally
bash <(curl https://hydra.nixos.org/build/57055021/reproduce)

Bootstrap

Nix composes all of these individual functions into a large package repository. This repository essentially calls every single top-level function, with support for recursive bindings in order to satisfy dependencies. Continuing with the hello example, we may have a top-level entry point like:

rec {
  hello = import /path/to/hello.nix { inherit stdenv fetchurl; }; (1)

  stdenv = import /path/to/stdenv.nix { inherit gcc; };

  fetchurl = import /path/to ;

  gcc = import /path/to/gcc.nix {};

  # ...
}
1 Import loads a file containing a function and then calls that function with the provided arguments

But wait - I just said this calls all functions… so wouldn’t that then mean that all software gets installed? The trick here is that Nix is a lazy language.
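
A tiny illustration of that laziness (the attribute names are made up): evaluating one attribute never forces its siblings.

let pkgs = {
  boom  = throw "this is never evaluated";
  hello = "hello";
};
in pkgs.hello # returns "hello"; boom is never forced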

Ruby

  • Create or copy a Gemfile at the root dir of the project

  • Create a default.nix file :


{ bundlerEnv }:

bundlerEnv rec {
  name = "xxx-${version}";
  version = "4.10.11";
  gemdir = ./.;
}
  • Use bundix in the target directory:

$(nix-build '<nixpkgs>' -A bundix --no-out-link)/bin/bundix --magic (1)
1 magic locks, packs and writes dependencies. It will create both a gemset.nix file and a Gemfile.lock

Haskell

Concepts

Type class

Type classes are in a sense dual to type declarations. Whereas the latter define how types are created, type classes define how a set of types is consumed.

When talking about polymorphism, type classes enable a form of ad hoc polymorphism or overloading [4] that needs to be delimited as such to play well with parametric polymorphism and keep the type checking sane.

Type classes are not first-class in Haskell. They cannot be used in place of types (as you would in Java with interfaces).

They are internally implemented as dictionary passing: GHC puts the methods of the instance in a dictionary and passes it implicitly to any function having a class constraint.

It is best to look at them as a set of constraints on types. One notable drawback is that each type can have at most one implementation of a type class.

Eq, Show, Num, Integral, Ord, Enum are classical examples.

class Num a where
  (+) :: a -> a -> a
  (*) :: a -> a -> a
  (-) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a

Using enumFromTo from the Enum type class:

→ enumFromTo 3 8     -> [3,4,5,6,7,8]
→ enumFromTo 'a' 'f' -> "abcdef"

In Scala, type-classes are types themselves, and instances are first class values.

Type Family

data Nat = Zero | Succ Nat

-- Add is a type which is a function on types
type family Add (x :: Nat) (y :: Nat) :: Nat
-- Then comes the implementation of the (type) function
type instance Add Zero     y = y
type instance Add (Succ x) y = Succ (Add x y)
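
A minimal, self-contained sketch (repeating the definitions above) showing that the family reduces at compile time, using Data.Proxy as a witness:

{-# LANGUAGE DataKinds, TypeFamilies #-}

import Data.Proxy

data Nat = Zero | Succ Nat

type family Add (x :: Nat) (y :: Nat) :: Nat
type instance Add 'Zero     y = y
type instance Add ('Succ x) y = 'Succ (Add x y)

-- this only type-checks because GHC reduces
-- Add ('Succ 'Zero) ('Succ 'Zero) to 'Succ ('Succ 'Zero) at compile time
two :: Proxy (Add ('Succ 'Zero) ('Succ 'Zero))
two = Proxy :: Proxy ('Succ ('Succ 'Zero))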

Typeable

The Typeable class is used to create runtime type information for arbitrary types:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Typeable

data Animal = Cat | Dog deriving Typeable
data Zoo a = Zoo [a] deriving Typeable

example :: TypeRep (1)
example = typeOf (Zoo [Cat, Dog]) (2)
-- Zoo Animal
1 Runtime representation of the type of the value
2 typeOf is the older, value-based form of typeRep; it is kept for backwards compatibility
class Typeable a where
  typeRep :: Proxy a -> TypeRep (1)
1 takes a type (Proxy) that it never looks at

Typeable is actually as old as Haskell (before it was even called Haskell …​)

Ref/State Primitives

MVars

A concurrency primitive designed for access from multiple threads. It is a box which can be full or empty. If a thread tries to read a value from an empty MVar (takeMVar), it will block until the MVar gets filled by another thread; likewise, writing (putMVar) to a full MVar blocks until it is emptied.
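
A minimal sketch of this blocking behaviour:

import Control.Concurrent
import Control.Concurrent.MVar

-- the forked thread fills the box while the main thread blocks on takeMVar
main :: IO ()
main = do
  box <- newEmptyMVar
  _ <- forkIO (putMVar box "hello from another thread")
  msg <- takeMVar box -- blocks until the MVar is filled
  putStrLn msg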

IVar

An immutable variable: you are only allowed to write to it once.

STM

retry aborts the transaction and retries it whenever one of the TVars it read gets modified.

IORef

Just a reference to some data, a cell. Operate in IO. You can think of it like a database, file, or other external data store. atomicModifyIORef uses CAS (compare and swap implemented at the hardware level) to guarantee the atomicity of read-modify-write kind of operations.

Functor

A functor is a structure-preserving mapping (or homomorphism) between 2 categories.

This means that:

  • for an object A in one category, there is a corresponding object F A in the second one.

  • for a morphism (A → B), there is the corresponding F A → F B

In Haskell, the objects are types and the mappings are functions. Type constructors (* → *) are used to map types into types.

class Functor f where
	fmap :: (a -> b) -> f a -> f b

The functor defines the action of an arbitrary function (a → b) on a structure (f a) of elements of type a resulting in the same structure but full of elements of type b.

Laws:
fmap id = id

fmap (g . h) = fmap g . fmap h
Example:
instance Functor ((->) r) where
  fmap f g = f . g -- or fmap = (.)

Another intuition is to look at functors as producers of output that can have its type adapted. So Maybe a represents an output of type a that might be present (Just a) or absent (Nothing). fmap f allows us to adapt the output of type a to an output of type b.

Whenever you have producer of outputs, you might also have the dual consumer of inputs. This is where Contravariant comes in. The intuition behind a Contravariant is that it reflects a sort of "consumer of input" that can have the type of accepted input adapted.

class Contravariant f where
  contramap :: (b -> a) -> f a -> f b

So here we can adapt the input to go from a consumer of input 'a' to a consumer of input 'b', but to get there you need to provide a function from 'b' to 'a'.
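
A small sketch using Predicate from Data.Functor.Contravariant (in base):

import Data.Functor.Contravariant (Predicate (..), contramap)

-- Predicate a is a consumer of values of type a
isEven :: Predicate Int
isEven = Predicate even

-- contramap adapts the consumer: to test a String, first take its length
hasEvenLength :: Predicate String
hasEvenLength = contramap length isEven

-- getPredicate hasEvenLength "haskell" == False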

Isomorphisms

Category theory allows us to give a precise, abstract (works for all categories) and self-contained definition of an isomorphism:

An arrow/morphism f: A → B is called an isomorphism in C if there is an arrow g that goes from B to A such that:
g ∘ f = 1A and f ∘ g = 1B

Applicative

With a functor f it is not possible to apply a function wrapped by the structure f to a value wrapped by f. This is given by Applicative:

class Functor f => Applicative f where
  pure :: a -> f a
 (<*>) :: f (a -> b) -> f a -> f b

<*> is just function application within a computational context.

As soon as you want to define the type (a → b → c) → f a → f b → f c you need the applicative construction:

liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2 f a b = fmap f a <*> b

It is not that hard to convince yourself that an applicative functor is just a functor that knows how to lift functions of arbitrary arities.

Law
fmap g x = pure g <*> x

Applicative functors are to be preferred to monads when the structure of a computation is fixed a priori. That makes it possible to perform certain kinds of static analysis on applicative values.
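
A small example with the Maybe applicative, where the shape of the computation is fixed up front:

import Control.Applicative (liftA2)

-- both parts must be present for the whole to succeed
fullName :: Maybe String -> Maybe String -> Maybe String
fullName = liftA2 (\f l -> f ++ " " ++ l)

-- fullName (Just "Ada") (Just "Lovelace") == Just "Ada Lovelace"
-- fullName Nothing      (Just "Lovelace") == Nothing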

Alternative

An Alternative instance gives an applicative functor the structure of a monoid, with empty as the unit element, and <|> as the binary operation.

class Applicative f => Alternative f where
  empty :: f a
 (<|>) :: f a -> f a -> f a
asum

gives you the first successful computation or the last zero value. It really disregards failures, striving for success. It is defined as:

asum = foldr (<|>) empty
→ asum [Just 1, Just 2, Nothing]              -> Just 1
→ asum [Left "Failing", Right()]              -> Right ()
→ asum [Left "Failing", Left "Failing again"] -> Left "Failing again"

Note that some monads such as ExceptT append the error messages (using the Monoid instance, the Monoid m ⇒ Left m case) when using asum or msum.

MonadPlus together with mzero, mplus and msum are the monadic equivalents. Since GHC 7.10, all MonadPlus instances are Alternative (likewise all monads are applicatives), so you should avoid using these and prefer empty, (<|>) and asum.

Monad

class Applicative m => Monad m where
  join :: m (m a) -> m a

(>>=) :: m a -> (a -> m b) -> m b (1)
1 The signature of bind allows the second computation to depend on the value of the first one.

Monadic values are produced in a context. Monads provide both substitution (fmap) and renormalization (join).

m >>= f = join (fmap f m)

Even if a monad is strictly more powerful than an Applicative, there are situations for which an applicative is the only valid choice. Indeed <*> lets you explore both arguments by pattern matching but with ap the right hand side cannot be evaluated without the result from the left.

As a stretch: while applicative allows for parallelism, monad allows for sequencing.

A monad is like a monoid where we combine functors "vertically". join is analogous to (+) and return to 0.

By law >> = *>. Consequently mapM_ = traverse_.
Typical computational contexts modelled by monads:

  • Side-Effect

  • Environment

  • Error

  • Indeterminism

Free

A free construction is a real instance of that construction that holds no extra property. It is the least special possible instance. A free monad is just substitution (fmap) with the minimum amount of renormalization (join) needed to pass the monad laws.

It is perfect for separating syntax (data, AST, parsing) from semantics (interpretation).

The free monad is guaranteed to be the formulation that gives you the most flexibility in how to interpret it, since it is purely syntactic.

data Free f a = Pure a | Free (f (Free f a))
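
A minimal, self-contained sketch of that syntax/semantics split with a toy console DSL (ConsoleF and runIO are made up for illustration):

{-# LANGUAGE DeriveFunctor #-}

-- the Free type from above
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- syntax: one constructor per primitive instruction
data ConsoleF next = PutLine String next | GetLine (String -> next)
  deriving Functor

putLine :: String -> Free ConsoleF ()
putLine s = Free (PutLine s (Pure ()))

readLine :: Free ConsoleF String
readLine = Free (GetLine Pure)

-- semantics: one possible interpreter, here in IO
runIO :: Free ConsoleF a -> IO a
runIO (Pure a)             = pure a
runIO (Free (PutLine s k)) = putStrLn s >> runIO k
runIO (Free (GetLine k))   = getLine >>= runIO . k

greet :: Free ConsoleF ()
greet = do
  putLine "name?"
  n <- readLine
  putLine ("hello " ++ n)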

The fixed point of a function is conceptually just the repeated application of that function: fix f = f (f (f (f …))), or equivalently fix f = f (fix f).

A Monad n is a free Monad for f if every Monad homomorphism from n to another monad m is equivalent to a natural transformation from f to m.

Existential classes

When someone defines a universal type ∀X they’re saying: you can plug in whatever type you want, I don’t need to know anything about the type to do my job, I’ll only refer to it opaquely as X.

When someone defines an existential type ∃X they’re saying: I’ll use whatever type I want here; you won’t know anything about the type, so you can only refer to it opaquely as X.

ByteString

  • Word8 is Haskell’s standard representation of a byte

  • ByteString character functions (Data.ByteString.Char8) only work with ASCII text, hence the Char8 in the module name → if you are working with unicode, you should use the text package

  • In general, use strict ByteStrings when you have control over the message. Lazy ByteStrings are a bit more flexible and used for streaming.

Laziness

Reduction is done using outermost reduction. For instance:

loop = tail loop

fst (1, loop)
-- innermost reduction gives:
-- fst (1, (tail loop))
-- fst (1, (tail (tail loop))) and never terminates
-- but outermost reduction gives:
-- fst (1, loop) = 1 and terminates
Redex
-- only one redex (2*3) both innermost and outermost
1 + (2 * 3)

-- 2 redexes :
-- (\x -> 1 + x ) (2 * 3) outermost
-- (2 * 3) innermost
(\x -> 1 + x ) (2 * 3)

Mind blowing

instance Monoid r => Monoid (Managed r) where
    mempty = pure mempty
    mappend = liftA2 mappend
xs = 1 : [x + 1 | x <- xs] --> [1,2,3 ...]
Right cfg -> return . Right . query cfg fp =<< F.newFileCache

UI

  • HsQML (qt 5)

  • SDL2/gl for game

  • Web (ghcjs, threepenny, …​)

Pitfall

(++) needs to reconstruct the list on the left!

# ! inefficient !
→ [1..10000] ++ [4]

Useful

-fdefer-type-errors

Fold and Traverse

Introduction

Folding is the act of reducing a structure to a single value.

We can see them as consumers or comonads.[5]

foldl/foldr

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)

foldl :: (s -> a -> s) -> s -> [a] -> s
foldl  f v []     =  v
foldl  f v (x:xs) =  foldl f (f v x) xs

Both functions take 3 arguments:

  • a combining function 'f'

  • a default value 'v'

  • the data to be folded

The default value 'v' deals with the empty list []. For non-empty lists, foldl1/foldr1 are best suited.

foldr

You can think of foldr non-recursively as simultaneously replacing each (:) in a list by a given function, and [] by a given value:

foldr (-) 0 (1:2:3:[]) = 1 - (2 - (3 - 0)) = 1 - (2 - 3) = 1 - (-1) = 2
Foldr

foldr is handy if f is not strict in both arguments. That way we can rely on laziness to stop the recursion (or build an infinite list). map, for instance, has to use foldr to maintain its laziness capabilities:

map f = foldr (\x ys -> f x : ys) []
-- or map' f = foldr ((:) . f) []

-- of course it can also be defined with recursion only
map' :: (a -> b) -> [a] -> [b]
map' _ [] = []
map' f (x:xs) = f x : map' f xs

-- ex
takeWhile (< 12) $ map (*2) [1..]
foldl

On the other hand, when the whole list needs to be traversed (sum or reverse are two examples), foldl' is actually more efficient in terms of memory.

foldl (-) 0 (1:2:3:[]) = (0 - 1) - 2 - 3 = -6

reverse = foldl' (flip (:)) []

The strict version List.foldl' should always be used instead of the foldl from Prelude. The Foldable type class also provides a strict foldl' method.

Foldl

Foldl package

To get a better representation for fold we need to transform the function into data.

{-# LANGUAGE ExistentialQuantification #-}
-- existential datatype (note that `x` does not appear on the left side)
data Fold a b
--                    step func      initial acc     extract func (done)
 = forall x . Fold  (x -> a -> x)       x             (x -> b)

-- expressed as a GADT it would be:
data Fold a b where
  Fold :: (x -> a -> x) -> x -> (x -> b) -> Fold a b

Fold is a functor, a monoid and an applicative. It is also a profunctor and a comonad. It is actually isomorphic to a Moore machine (see https://www.fpcomplete.com/school/to-infinity-and-beyond/pick-of-the-week/part-2)

-- | Apply a strict left 'Fold' to a 'Foldable' container
fold :: Foldable f => Fold a b -> f a -> b
fold (Fold step begin done) as = F.foldr cons done as begin
  where
    cons a k x = k $! step x a

This makes it possible to cleanly define the function average without traversing the foldable container twice.

average = (/) <$> sum <*> genericLength

sum :: Num a => Fold a a
sum = Fold (+) 0 id

genericLength :: Num b => Fold a b
genericLength = Fold (\n _ -> n + 1) 0 id

λ> fold average [1..10000000]
Alternative monoid definition

As explained in Gabriel’s beautiful fold talk, Fold can similarly be defined as

data Fold i o = forall m . Monoid m => Fold (i -> m) (m -> o)

This approach can express parallel computation but it won’t encode stateful folds.

FoldM
data FoldM m a b =
  -- | @FoldM @ @ step @ @ initial @ @ extract@
  forall x . FoldM (x -> a -> m x) (m x) (x -> m b)

Fold is equivalent to FoldM Identity. You use generalize (with no performance penalty) to get a FoldM from a Fold:

generalize :: Monad m => Fold a b -> FoldM m a b
In the turtle library, FoldM plays the role of a consumer and Shell the role of a Producer. fold is how you connect them together.

Foldable/Traversable

Fold

fold from the foldl package takes as an argument any Foldable structure. Foldables are structures that we can reduce to a single result.

class Foldable t where
  fold    :: Monoid m => t m -> m
  foldMap :: Monoid m => (a -> m) -> t a -> m
  foldMap g = mconcat . map g

λ> foldMap Sum [1,2,3,4]
Sum {getSum = 10}
fold and foldMap require the elements of the Foldable to be monoids.

In Data.Foldable, mapM_ is defined with foldr (which is kind of mind blowing):

mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
mapM_ f = foldr ((>>) . f) (return ())
Traversable

When you traverse a structure you actually want to keep it intact. The function traverse is exactly mapM generalized to all Traversables and to any Applicative effect; traverse is an "effectful" fmap.

class (Functor t, Foldable t) => Traversable t where
  traverse  :: Applicative f => (a -> f b) -> t a -> f (t b)
  traverse f = sequenceA . fmap f
  mapM = traverse
  sequenceA :: Applicative f => t (f a) -> f (t a)
  sequenceA = traverse id

for = flip traverse
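
A small example of traverse with the Maybe applicative, where a single failure makes the whole result Nothing:

import Text.Read (readMaybe)

-- all-or-nothing parsing
parseAll :: [String] -> Maybe [Int]
parseAll = traverse readMaybe

-- parseAll ["1", "2", "3"] == Just [1,2,3]
-- parseAll ["1", "oops"]   == Nothing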

Lenses

Usage

{-# LANGUAGE TemplateHaskell #-}

data Person = Person
  { _firstname :: String
  , _surname   :: String
  }
-- building lenses for _firstname and _surname
makeLenses ''Person
-- create a HasPerson class with the `firstname` and `surname` optics
makeClassy ''Person
-- create HasFirstname and HasSurname classes
data Person = Person
  { _personFirstname :: String
  , _personSurname   :: String
  }
makeFields ''Person

Lens

A lens is a first-class reference to a subpart of some data type.

Lens' s a operates on a container s and puts the focus on a. The more general Lens s t a b says: when you replace the a in s with a b, the container type changes to t.

Note that lenses are not accessors but focusers: a lens focuses on a particular location inside a structure. These are the types we want for view, set and over/update:

view :: Lens' s a -> s -> a
set :: Lens' s a -> a -> s -> s
over :: Lens' s a -> (a -> a) -> s -> s

The big insight is that the Lens' type can be implemented as a unique type that works for all 3 methods (given we add a Functor constraint). It is actually a type synonym for:

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s (1)
1 This is a kind of a lifting from the element (a → f a) to the container (s → f s)
Lenses form a category where . is composition and id is the identity.
Examples
> over _1 (++ "!!!") ("goal", "the crowd goes wild")
> ("goal", "the crowd goes wild") & _1 %~ (++ "!!!") (1)
("goal!!!", "the crowd goes wild")

> ("world", "world") & _1 .~ "hello" & _2 .~ "hello" (1)
1 & allows to start the expression from s and then compose. It is defined as the reverse of $ operator.

Common operators

^.   view

^?   preview

^..  toListOf

.~   set

%~   over

.=   set in the State monad (assign)

Traverse

Traversals are Lenses which focus on multiple targets simultaneously. We don't actually know how many targets they might be focusing on: it could be exactly 1 (like a Lens), maybe 0 (like a Prism), or several. In that regard, a traversal is like a Lens', except weaker (more general):

type Traversal' a b =
    forall f . (Applicative f) => (b -> f b) -> (a -> f a)
firstOf/lastOf traverse :: Traversable t => t a -> Maybe a

> firstOf traverse [1,2,3]
Just 1
> [1..8] & lastOf traverse
Just 8
toListOf (^..)

view list of targets

preview (^?)

like view for Prism’s or Traversal’s. It handles access that focuses on either 0 or 1 targets.

Prisms

Prisms are kind of like Lenses that can fail or miss.

Note how the Monoid instance of String allows us to get a plain String back from this expression:

> s = (Left "hello", 5)
> s ^. _1._Left
"hello"
> s ^. _1._Right
""

But without a Monoid instance it cannot work and (^?) is necessary:

> s = (Left 5, 5)
> s ^? _1._Left
Just 5
> s ^? _1._Right
Nothing
> :t preview _Right (Right 1)
Num b => Maybe b

Utils

-- create the nested Map when it is missing:
Map.empty & at "hello" . non Map.empty . at "world" ?~ "!!!"
-- > fromList [("hello",fromList [("world","!!!")])]

Monads

Reader

State

The State monad is just an abstraction for a function that takes a state and returns an intermediate value and some new state value:

newtype State s a = State { runState :: s -> (a, s) }

It is commonly used when needing state in a single thread of control. It doesn't actually use mutable state and so does not necessarily operate in IO.
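
A small sketch (using Control.Monad.State from mtl) that labels list elements while threading a counter implicitly:

import Control.Monad.State

label :: [a] -> [(Int, a)]
label xs = evalState (traverse step xs) 0
  where
    step x = do
      n <- get
      put (n + 1)
      pure (n, x)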

ST

The ST[6] monad lets you use update-in-place, but unlike IO it is escapable. This means it uses system trickery to ensure that mutable data can’t escape the monad; that is, when you run an ST computation you get a pure result.

ST actions have the form:

-- an ST action returning a value of type a in state thread s
newtype ST s a = ST (Store s -> (a, Store s))
 -- a mutable variable in thread s
data STRef s a = STRef (MutVar# s a)

newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()

The reason ST is interesting is that it’s a primitive monad like IO, allowing computations to perform low-level manipulations on bytearrays and pointers. This means that ST can provide a pure interface while using low-level operations on mutable data, meaning it’s very fast. From the perspective of the program, it’s as if the ST computation runs in a separate thread with thread-local storage.
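
A classic sketch: a pure sum computed with in-place mutation inside runST:

import Control.Monad.ST
import Data.STRef

sumST :: Num a => [a] -> a
sumST xs = runST $ do
  ref <- newSTRef 0                        -- a mutable cell, local to this ST computation
  mapM_ (\x -> modifySTRef' ref (+ x)) xs  -- strict in-place updates
  readSTRef ref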

Exceptions

Types

Synchronous exceptions

Generated as a result of a failing action in IO (same thread). Usually thrown using throwIO.

Impure exceptions

Thrown in pure code by partial functions. Ideally we would not use such functions; a better practice is to return an Either type in this situation.

Asynchronous exceptions

Can occur anywhere, including in pure code. Generated when another thread or the runtime system is trying to kill the current thread (via throwTo) or report an “unrecoverable” situation like a StackOverflow.

Interruptible actions

Some operations are interruptible by async exceptions even within a mask. This is the case for blocking functions such as takeMVar but also for most I/O operations dealing with the outside world.

Primitives

Throwing
throwIO :: Exception e => e -> IO a
Catching
try :: Exception e => IO a -> IO (Either e a)

catch  :: Exception e
        => IO a        -- ^ computation
        -> (e -> IO a) -- ^ handler
        -> IO a
  • catch has an implicit mask around the handler.

  • try does not have a similar default. Don’t use it for recovering from an async exception.
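
A small sketch of try turning a synchronous IO failure into an Either:

import Control.Exception (IOException, try)

safeRead :: FilePath -> IO (Either IOException String)
safeRead path = try (readFile path)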

Cleanup
finally
        :: IO a -- ^ computation
        -> IO b -- ^ computation to run afterward even if an exception was raised
        -> IO a
a `finally` sequel =
  mask $ \restore -> do
    r <- restore a `onException` sequel
    _ <- sequel
    return r

-- | Like 'finally', but only performs the final action if there was an
-- exception raised by the computation.
onException :: IO a -> IO b -> IO a
onException io what =
  io `catch` \e -> do
    _ <- what
    throwIO (e :: SomeException)
Acquiring
bracket
        :: IO a        -- ^ acquire resource
        -> (a -> IO b) -- ^ release resource
        -> (a -> IO c) -- ^ use resource
        -> IO c
bracket before after use =
  mask $ \restore -> do
    a <- before
    r <- restore (use a) `onException` after a
    _ <- after a
    return r

Monad primitives

The exceptions package defines Control.Monad.Catch with

MonadThrow
class Monad m => MonadThrow m where
  throwM :: Exception e => e -> m a
MonadCatch
class MonadThrow m => MonadCatch m where
  catch :: Exception e => m a -> (e -> m a) -> m a
MonadMask
class MonadCatch m => MonadMask m where
  mask :: ((forall a. m a -> m a) -> m b) -> m b
  uninterruptibleMask :: ((forall a. m a -> m a) -> m b) -> m b
  • Instances should ensure that, in the code f `finally` g, the action g is called regardless of what occurs within f, including async exceptions.

  • ExceptT is not an instance of MonadMask. See MonadMask vs MonadBracket

Shake

Example

shake avoids rebuilding when it is not necessary. To achieve this goal, it needs to know about file dependencies. Let's take as an example the task of running a test suite.

In the following example you will need to define dependencies in steps as such:

  1. You need to display a test report, let’s call it 'build.last'

  2. Building build.last requires calling an external command for each node.

buildDir = "_build"

main = shakeArgs shakeOptions{shakeFiles=buildDir </> "_shake"} $ do

  daemon <- liftIO $ initDaemon pref (1)

  "test" ~> do (2)
    content <- readFile' (buildDir <> "/build.last") (3)
    putNormal content
    let hasFailure = any (\i -> "x" `isPrefixOf` i) (lines content)
    if hasFailure
      then fail "The build has failed !"
      else liftIO $ putDoc (dullgreen "All green." <> line)

  buildDir <> "/build.last" %> \out -> do
    Right nx <- liftIO $ runExceptT $ getNodes pdbapi QEmpty
    let deps = [ buildDir <> "/" <> Text.unpack n <> ".node" | n <- nx ^.. traverse.nodeInfoName]
    need deps
    Stdout stdout <- quietly $ cmd ("cat"::String) deps
    writeFileChanged out stdout

  buildDir <> "//*.node" %> \out -> do
    let node = dropDirectory1 (dropExtension out)
    facts <- liftIO $ mergeFacts (pref ^. prefFactsDefault) (pref ^. prefFactsOverride) <$> F.puppetDBFacts (Text.pack node) pdbapi
    r <- liftIO $ getCatalog daemon (Text.pack node) facts
    deps <- liftIO $ Set.fromList .HM.keys <$> getStats (parserStats daemon)
    need $ Text.unpack <$> Set.toList deps
    case r of
      S.Right _ ->
        liftIO $ withFile out WriteMode (\h -> hPutDoc h ( dullgreen "✓" <+> text node <> line))
      S.Left msg ->
        liftIO $ withFile out WriteMode (\h -> hPutDoc h ( char 'x' <> space <> red (text node) <> line <> indent 2 (getError msg) <> line))
1 Each build would execute this line. TODO: is there a way to avoid this?
2 One of the top targets. It is a phony rule because it does not produce anything.

Turtle

Streams

The Shell type represents a stream of values. You can think of Shell as [] + IO + Managed.

newtype Shell a = Shell { _foldIO :: forall r . FoldM IO a r -> IO r }

You invoke external shell commands using proc or shell. If you would rather have an exception thrown than an exit code returned on failure, use procs and shells; proc(s) is more secure but it won't do any shell string interpolation.

shell
    :: Text        -- Command line
    -> Shell Line  -- Lines of standard input to feed to program
    -> io ExitCode

shells :: Text-> Shell Line -> io ()
select ::        [a] -> Shell a
liftIO ::      IO a  -> Shell a
using  :: Managed a  -> Shell a

-- usual construction primitive
empty :: Shell a

-- consume the stream by printing it to stdout
view   :: Show a => Shell a -> IO ()
stdout :: Shell Text -> IO ()

-- consume the (side-effect) stream, discarding any unused values
sh :: MonadIO io => Shell a -> io ()

You can simulate piping the result of a command with inshell or inproc:

inshell :: Text -> Shell Line -> Shell Line

inproc "curl" ["-s"
              , "http://"
              ] empty (1)
    & output "filename.ext" (2)
1 keep the result of a command as a stream
2 pipe and copy
When using inshell you lose the ability to care about the exit code of the command that produces the stream.

Shell is also an instance of MonadPlus (and thus Alternative). So you can concatenate two Shell streams using <|>.

Folding

Whenever you peek into the value of a Shell stream (for instance by binding it with <-), you are effectively looping over all its values (as the list monad does). For instance this code is bogus:

bogus
do
  found <- testpath =<< find (prefix (text "/home/vagrant/zsh")) "/home/vagrant"
  unless found $ ...

You will need to consume the stream and one good way to do so is using fold from the foldl package:

import qualified Control.Foldl as Fold

main = do
  not_found <- fold (find (prefix (text "/home/vagrant/zsh")) "/home/vagrant") Fold.null
  when (not_found) $ do ...

Similarly, here is a utility function that checks if a file is empty:

isFileEmpty :: MonadIO io => FilePath -> io Bool
isFileEmpty path =
  fold (input path) Fold.null

FilePath

Turtle uses the deprecated system-filepath package to handle filepaths in a more type-safe way[7]. Watch out, as it is at times a bit surprising:

common traps
let first  = "/home/vagrant" :: FilePath
    second = "/plugin" (1)
    test   = first <> second -- "/plugin"

eclipseVersion = "4.5" (2)
let fp = "foo" </> "bar_" <> eclipseVersion </> "plugin" -- foo/bar_/4.5/plugin (3)
1 don’t start with a / as it means you want to concatenate an absolute path
2 gives you a FilePath automatically thanks to the IsString instance
3 in system-filepath, <> and </> are both aliases for append

When mixing filepaths and text, the best strategy is probably to keep the filepath encoding and then convert to text if necessary:

let path = "foo" </> fromText eclipseVersion </> "plugin"
    _path = format fp path
Use </> for appending filepaths, use <> for appending text.

Command line options

data Command
  = Console
  | Stack (Maybe StackName, StackCommand)
  deriving (Show)

commandParser :: Parser Command
commandParser =
      Console <$  subcommand "console" "Help msg" (pure ())
  <|> Stack   <$> subcommand "stack" "Help msg" stackParser (1)
1 remaining parser (after 'stack')

When using a group you will need a single datatype to extract the value of the rest of the command. Don’t do this:

data Command = Stack Int Text

commandParser :: Parser Command
commandParser = Stack <$> subcommand "stack" "Help" intParser <*> textParser

Pipes

Primitives

StateT
newtype StateT s m a = StateT {
    runStateT :: s -> m (a, s)
}
Free Monad
data Free f a = Free (f (Free f a)) | Pure a

liftF :: Functor f => f a -> Free f a
Void

is the uninhabited type and denotes a closed output

Proxy

Pipes defines a single type Proxy which is a monad transformer:

 (Proxy p) => p a' a b' b m r

  Upstream | Downstream
     +---------+
     |         |
 a' <==       <== b'
     |  Proxy  |
 a  ==>   m   ==> b
     |         |
     +----|----+
          r
type Effect = Proxy X () () X
runEffect :: (Monad m) => Effect m r -> m r

Effect is a proxy that never yield or wait. The default API exposes a pull-based unidirectional flow.

Producer

A Producer is a monad transformer that extends any base monad with a yield command. yield emits a value, suspending the current Producer until the value is consumed. If nobody consumes the value (which is possible) then yield never returns.

type Producer b m r = Proxy X () () b m r
     +---------+
     |         |
Void <==       <== ()
     |  Proxy  |
 ()  ==>       ==> b
     |         |
     +---------+
yield :: (Monad m) => b -> Producer' b m ()

for :: (Monad m)
    =>       Proxy x' x b' b m a'
    -> (b -> Proxy x' x c' c m b')
    ->       Proxy x' x c' c m a'


-- "into" compose the bodies of `for`
(~>) :: (Monad m)
     => (a -> Producer b m r)
     -> (b -> Producer c m r)
     -> (a -> Producer c m r)
(f ~> g) x = for (f x) g
~> and yield form a Category ("Generator") where yield is the identity.

With for you consume every element of a Producer the exact same way. If this is not suitable, use next or a Consumer.

Think of next as pattern matching on the head of the Producer. This Either returns a Left if the Producer is done or it returns a Right containing the next value, a, along with the remainder of the Producer:

next :: Monad m => Producer a m r -> m (Either r (a, Producer a m r))

Consumer

A consumer represents an "exhaustible" (it may refuse to accept new values) and possibly effectful sink of values. An example of an exhaustible sink is toOutput from pipes-concurrency, which will terminate if the Output it writes to has been sealed.

await blocks waiting for a new value. If nobody provides it (which is possible) then await never returns.

type Consumer a = Proxy () a () X
     +---------+
     |         |
 () <==       <== ()
     |  Proxy  |
 a  ==>       ==> Void
     |         |
     +---------+
await :: Monad m => Consumer' a m a
(>~)

Repeatedly feeds await in the consumer with the action passed as the first parameter. This allows consumer composition

Examples
runEffect $ lift getLine >~ stdoutLn
        +- Feed             +- Consumer to      +- Returns new
        |  action           |  feed             |  Effect
        v                   v                   v
(>~) :: Effect m b       -> Consumer b m c   -> Effect m c
(>~) :: Consumer a m b   -> Consumer b m c   -> Consumer a m c
(>~) :: Producer y m b   -> Pipe     b y m c -> Producer   y m c
(>~) :: Pipe     a y m b -> Pipe     b y m c -> Pipe     a y m c
(>~) and await form a Category where await is the identity.
Pipe
type Pipe a b = Proxy () a () b
     +---------+
     |         |
 () <==       <== ()
     |  Proxy  |
 a  ==>       ==> b
     |         |
     +---------+
(>->) :: Monad m => Producer a m r -> Consumer a m r -> Effect m r
(>->) :: Monad m => Producer a m r -> Pipe   a b m r -> Producer b m r
(>->) :: Monad m => Pipe   a b m r -> Consumer b m r -> Consumer a m r
(>->) :: Monad m => Pipe   a b m r -> Pipe   b c m r -> Pipe   a c m r

cat :: (Monad m) => Pipe a a m r
cat = forever $ do
    x <- await
    yield x
(>->) and cat form a Category where cat is the identity.
Bidirectional API
The response category
yield = respond
for = (//>)
(~>) = (/>/)
The reply category
await = request ()

Lift

StateP

Run StateT in the base monad of the Proxy passed as a second argument.

runStateP
    :: (Monad m)
    => s -- state (usually of type proxy)
    -> Proxy a' a b' b (S.StateT s m) r
    -> Proxy a' a b' b m (r, s)
Example
-- !! this returns a Producer a m (Maybe r, Producer a m r) !!
-- This makes sense: you are actually running the StateT monad from Producer a (StateT (Producer a m r) m) r
-- r is either Just, which means the original Producer is empty, or Nothing, which means you should go on drawing from the original Producer
-- The top producer accumulates your split, then you have a pair of a Maybe r and your original Producer

runStateP p $ do -- p will be used to feed the underlying proxy
    -- entering a monad of the form: (Proxy (<- StateT monad <- Proxy))
    -- All computation happens inside the underlying monad that is initially fed by the param p
    x <- lift draw -- lift the next value of the underlying proxy
    case x of -- Left if the underlying proxy is empty or Right with the drawn element
        Left  r -> return (Just r)
        Right a -> do
            yield a -- push `a` onto the top proxy
            (Just <$> input) >-> (Nothing <$ takeWhile (== a)) -- start streaming values from the underlying proxy

Concurrent API

You have got a mailbox!

(output, input) <- spawn Unbounded
producer >-> (consumer) output >...> input (producer) >-> consumer

Send to the mailbox using toOutput output (the output end can send mail): toOutput transforms the output into a Consumer. Read from the mailbox using fromInput input (the input end can receive mail): fromInput transforms the input into a Producer.

newtype Input a = Input { recv :: S.STM (Maybe a) }

Pipes-Handle

Pipes-handle models the input/output stream analogy. An output stream accepts bytes (you write into it) whereas you read from an input stream. The proxy that can "read from" in the pipes ecosystem is the Consumer. By analogy, an output stream accepts output bytes and sends them to some sink, so you write into an output stream.

Stream

Pipes-Parse

Parser

Parsers are like Consumers but with the ability to keep the leftover:

type Parser a m r = forall x . StateT (Producer a m x) m r

draw :: (Monad m) => Parser a m (Maybe a)

runStateT  :: Parser a m r -> Producer a m x -> m (r, Producer a m x)
evalStateT :: Parser a m r -> Producer a m x -> m  r
execStateT :: Parser a m r -> Producer a m x -> m (   Producer a m x)
Lenses

Lenses serve as transformations in both directions.

splitAt
    :: Monad m
    => Int
    -> Lens' (Producer a m x) (Producer a m (Producer a m x))
zoom

Connect lenses to Parsers

zoom
    :: Lens' (Producer a m x) (Producer b m y)
    -> Parser b m r
    -> Parser a m r

Iso': don't provide them if there are error messages involved in encoding and decoding. Stick to Lens'.

Pipes-Group

FreeT nests each subsequent Producer within the return value of the previous Producer so that you cannot access the next Producer until you completely drain the current Producer.

split / transform / join paradigm

-- A "splitter" such as `groupBy`, `chunksOf` or `splitOn`
Producer a m ()           -> FreeT (Producer a m) m ()  ~   [a]  -> [[a]]

-- A "transformation" such as `takeFree`
FreeT (Producer a m) m () -> FreeT (Producer a m) m ()  ~  [[a]] -> [[a]]

-- A "joiner" such as `concat` or `intercalate`
FreeT (Producer a m) m () -> Producer a m ()            ~  [[a]] ->  [a]

Errors management

Empty Bytestring

If you want to transform a Producer of ByteString into another Producer, for instance of csv records, be careful to be immune to empty ByteString chunks. Indeed, pipes-bytestring operations don't guarantee that they won't drop empty bytestring chunks or create new ones.

-- first take the next element of the source
x <- lift (next source)
case x of
    Left () -> feedParser (k B.empty) (return ())
    Right (bs, source') ->
        if B.null bs
        then continue k source'
        else feedParser (k bs) source'
Managed

You have a resource a that can be acquired and then released.

-- | A @(Managed a)@ is a resource @(a)@ bracketed by acquisition and release
newtype Managed a = Manage
    { -- | Consume a managed resource
      with :: forall x . (a -> IO x) -> IO x
    }
Resource ((forall b. IO b -> IO b) -> IO (Allocated a))

Arrows and push based pipe

Events are discrete ← PUSH based.
Behaviors are continuous ← PULL based

ArrowChoice corresponds to concurrency and Arrow corresponds to parallelism

Controller/Model/View

Controller

Represents concurrent effectful inputs to your system. A Controller is really just a synonym for an Input from pipes-concurrency. So you have this function:

producer :: Buffer a -> Producer a IO () -> Managed (Controller a)
Model

A pure streaming transformation from the combined controllers to the combined views. You can test this pure kernel by swapping out controllers with predictable inputs.

asPipe :: Pipe a b (State s) () -> Model s a b
View

Handles all effectful outputs from the model.

asSink :: (a -> IO ()) -> View a
Run it
runMVC
  :: s -- initial state
  -> Model s a b
  -> Managed (View b, Controller a)
  -> IO s

Questions

type Producer b =                    Proxy Void () () b
type Producer' b m r = forall x' x . Proxy x' x () b m r

Dhall

Dhall is a programming language specialized for configuration files.

function
let double = \(n : Natural) -> n * 2 in double 4
/config/box
{ userName         = ""
, userEmail        = ""
, userStacks       = ["bos", "irisbox"]
, plugins          = True
, mrRepoUrl        = "git://github.com/CIRB/vcsh_mr_template.git"
}
{-# LANGUAGE DeriveGeneric #-}

data BoxConfig
  = BoxConfig
  { _userName        :: Text
  , _userEmail       :: Text
  , _repos           :: Vector Text (1)
  , _eclipse         :: Bool
  } deriving (Generic, Show)

makeLenses ''BoxConfig

instance Interpret BoxConfig
1 Dhall uses vector instead of list
main :: IO ()
main = do
  box_config  <- Dhall.input auto "./config/box"
  configure (box_config^.userName) (box_config^.userEmail)
#! /usr/bin/env bash
readarray arr <<< $(dhall <<< '(./config/box ).repos' 2> /dev/null | jq -r 'join (" ")')
for s in ${arr}; do
    echo "$s"
done

Naming convention

Algebraic Data Type

Quite common (used in pipes, ekmett, servant, tibbe):

data List a
  = Cons a (List a)
  | Nil

Also used for simple sum or product declaration:

data MySum = A | B
data MyProduct = MyProduct Int String

Record

The most common (used in lens, fpcomplete, servant, tibbe, hindent):

data Person = Person
  { _firstName :: String -- ^ First name
  , _lastName  :: String -- ^ Last name
  } deriving (Eq, Show)

In order to mimic ADTs and to play well with haskell-indentation we could go with this instead (but it is less common!):

data Person
  = Person
  { _firstName :: String
  , _lastName  :: String
  } deriving (Eq, Show)

Module

module Puppet.Parser (
         expression
       , puppetParser
       , runPParser
       ) where

Idioms

Maybe

Use of a case to pattern match a maybe value is quite common:

  readline >>= \case
    Just "Y" -> pure ()
    _        -> die "Abort"

You might want to define an unwrapWith utility mimicking Rust-style unwrap, but it would be limited and impractical:

-- | Unwrap a maybe value in an io computation
-- passing an alert action in case of Nothing
unwrapWith :: MonadIO io => io a -> Maybe a -> io a (1)
unwrapWith io_alert v = maybe io_alert pure v
1 Note how `a` fixes the input/output

At the end of the day it is better to stick with the 'case pattern-matching' idiom even for simple cases and avoid the less readable maybe variant:

 readline >>= \case
   Nothing -> die "Abort"
   Just v  -> pure v

  readline >>= maybe (die "Abort") pure (1)
1 shorter but arguably more cryptic

Extensions

Syntax
  • LambdaCase

  • GADTSyntax

  • RankNTypes

  • BangPatterns

  • RecordWildCards

  • DuplicateRecordFields

  • MultiWayIf

Common
  • ScopedTypeVariables (69)

  • DeriveGeneric (66)

  • TupleSections (59)

  • MultiParamTypeClasses (56)

  • FlexibleInstances (33)

  • FlexibleContexts (31)

  • TypeOperators (29)

  • FunctionalDependencies (15)

Unknown
  • MonadComprehensions (12)

  • EmptyCase (3)

  • DisambiguateRecordFields (14)

  • RecursiveDo (10)

  • ParallelListComp

  • TypeFamilies

  • PatternSynonyms

  • PartialTypeSignatures

  • TypeApplications

Common functions

-- give a default and always get an a from a maybe value
fromMaybe :: a -> Maybe a -> a

-- eliminate a maybe value with a default result and a function
maybe :: b -> (a -> b) -> Maybe a -> b
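
For instance, in GHCi:

ghci> fromMaybe 0 (Just 5)
5
ghci> maybe 0 (+1) Nothing
0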

GADTs

GADTs let you associate different type parameters for different data constructors of a type.

For example, imagine we represent simple language terms that can only be bool/int literals and And/Add operations between those:

data Expr = ExprInt Int
          | ExprBool Bool
          | ExprAnd Expr Expr
          | ExprAdd Expr Expr

This would let us do invalid things like:

ExprAnd (ExprInt 5) (ExprBool True)

Firstly and less importantly, GADTs let us write the same definition using a different notation:

data Expr where
  ExprInt  :: Int -> Expr
  ExprBool :: Bool -> Expr
  ExprAnd  :: Expr -> Expr -> Expr
  ExprAdd  :: Expr -> Expr -> Expr

The real point of this notation is that it is an opportunity to associate different constructors of Expr with different type constraints and type parameters: you restrict the return type of each constructor.

data Expr  :: * -> * where
  ExprInt  :: Int -> Expr Int
  ExprBool :: Bool -> Expr Bool
  ExprAnd  :: Expr Bool -> Expr Bool -> Expr Bool
  ExprAdd  :: Expr Int -> Expr Int -> Expr Int

This rules out non-sensical terms like:

ExprAnd (ExprInt 5) (ExprBool True)
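
The standard payoff (an illustrative sketch, not from the original note) is an evaluator whose result type follows the index, with no failure case to handle:

eval :: Expr a -> a
eval (ExprInt n)   = n
eval (ExprBool b)  = b
eval (ExprAnd x y) = eval x && eval y
eval (ExprAdd x y) = eval x + eval y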

Additionally, GADTs let you add type-class constraints and forall’d variables to each of the constructors. For example, let’s say we want to represent a length-indexed list:

data LenList :: Nat -> * -> * where
  Nil :: LenList 0 a
  Cons :: a -> LenList n a -> LenList (1 + n) a

Note that not only do the two constructors have differing type params (0 vs 1 + n), Cons also constrains the "n" from the "LenList" type index (aka type parameter) to match the "n" of the given tail.

Another important facet of GADTs is that all this extra information is not just used to type-check value constructions as shown above. It also gives you back type information when you do case analysis. i.e:

case myLenList of
  Nil       -> ... -- the type of myLenList in this case is inferred to (LenList 0 a)
  Cons x xs -> ... -- the type of myLenList in this case is inferred to (LenList (1 + n) a)
                   -- and the type of xs is inferred to (LenList n a)

To reiterate, the type of the term we're case analyzing is inferred differently according to runtime values (which constructor is chosen).

Lastly, by allowing interesting types and constraints on each constructor, GADTs implicitly allow existential quantification, and storing of type-class instances inside values.

For example, this existentially quantified (and mostly useless) type:

data SomeShowable = forall a. Show a => MkSomeShowable a

Can be represented with GADTs as:

data SomeShowable where
  MkSomeShowable :: Show a => a -> SomeShowable

Note the forall a. can be left implicit in the GADT version.

Interestingly, with GADTs, you can have existential quantification only in some of your constructors. You can have differing type-class instances stored inside different constructors. When you pattern-match your GADT constructor, the instance implicitly comes into scope.
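
For example (a sketch): pattern matching on MkSomeShowable brings the Show instance into scope, so show can be applied to the hidden value:

showAll :: [SomeShowable] -> [String]
showAll xs = [ show a | MkSomeShowable a <- xs ]

-- showAll [MkSomeShowable (42 :: Int), MkSomeShowable "hi"] == ["42","\"hi\""]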

Operational

Think of monads as sequences of primitive instructions.
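
A minimal sketch of that idea with the operational package (StackI is a made-up instruction set):

{-# LANGUAGE GADTs #-}
import Control.Monad.Operational (Program, singleton)

data StackI a where          -- the primitive instructions
  Push :: Int -> StackI ()
  Pop  :: StackI (Maybe Int)

push :: Int -> Program StackI ()
push = singleton . Push

pop :: Program StackI (Maybe Int)
pop = singleton Pop

-- a monadic value is then literally a sequence of instructions
addTop :: Program StackI ()
addTop = do
  mx <- pop
  my <- pop
  case (+) <$> mx <*> my of
    Just z  -> push z
    Nothing -> return ()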

Operator   Colloquial name   Example

>>=        bind
>>         then
*>         then
->         to                a -> b: "a to b"
<-         bind              (as it desugars to >>=)
<$>        (f)map
<$         map-replace by    0 <$ f: "f map-replace by 0"
<*>        ap(ply)
$          apply to or of
.          after             a . b $ c: "a after b applied to c"
!!         index
!          index, strict     a ! b: "a index b"; foo !x: "foo strict x"
<|>        or, appbin        expr <|> term: "expr or term"
++         append
[]         empty list
:          cons
::         of type           f x :: Int: "f x of type Int"
\          lambda
@          as                go ll@(l:ls): "go ll as l cons ls"
~          lazy              go ~(a,b): "go lazy pair a, b"
>=>        fish
<=<        left fish

Developments

generate TAGS
hasktags -e src

Blast

Main ideas:

  • build two 'isomorphic' ASTs for the slave and the master. The shape (the nodes) is the same but each node is slightly different in the master and in the slaves

  • find a way so that the execution of f x can be done in a type-safe way on the slaves using the cache

  • you know from the start how many slaves you have

    Kubernetes

    Each pod in Kubernetes has a unique IP address → one slave == one pod

If a slave fails, you get another one inside the pod → same IP

Spacemacs

Cheatsheet

Magit

<SPC> g e

Ediff a file (awesome)

<SPC> g d r

Revert hunk

Helm

c-k

"abort" helm completion !!

Ido

c-Return

dired

c-o

| open in a new buffer

c-s

- open in a new buffer

<SPC> s l

helm-semantic-or-imenu

<SPC> s w g

search with google

<SPC> s f

search within a path

<SPC> s s

helm-swoop

Misc

<SPC> v

expand region mode

<SPC> a u

"undo-tree-visualize"

<SPC> p f

helm-find file for your project !amazing!

<SPC> f e h

open helm spacemacs for help

<SPC> f S

save all buffers

<SPC> f y

show file name

<SPC> i K

insert empty line above

<SPC> b b

helm find buffer

<SPC> x J / SPC x K

move lines up and down

<SPC> r y

kill ring

,gg

jump to def (awesome!)

<SPC> p y

find tags

<SPC> /

ag search

<SPC> m c R

reload .spacemacs

<SPC> f e d

open .spacemacs

<SPC> n r

narrow region

(define-key evil-normal-state-map (kbd "C-p C-b") 'ibuffer)

Surround

Enter visual state with v → e to select the expression → s to surround → ) to wrap with parens without extra spaces.

Tramp

/ssh:saltmaster_testing|sudo:root@saltmaster:/srv/myfile.sls

Replace on multiple files

Docker

Intro

Containers work by isolating the differences between applications inside the container so that everything outside the container can be standardized. At the core of container technology are cgroups and namespaces. Control groups work by allowing the host to share and limit the resources each process or container can consume. Namespaces limit processes to seeing only the process IDs in their own namespace.

A Docker environment is made up of filesystems layered over each other. At the base is a boot filesystem, docker’s next layer is the root filesystem, rootfs. Then Docker takes advantage of a union mount to add more readonly filesystems on top. These filesystems are called "images". Finally, Docker mounts a read-write filesystem on top of any layers below. This is where whatever process we want our Docker container to run will execute.

User images are named using "initial/name:tag"

The RUN instruction in a Dockerfile executes commands on the current image and commits the results.

Useful command

docker build -t initial/name .
docker commit containerid imagename
docker ps
docker images
docker run -i -t initial/name /bin/bash
docker run -d --net compose_default puppet/puppet-agent-centos  (1) (2)
docker exec (3)
1 -d for detached (will run in the background)
2 --net compose_default specifies the network (this one is created by default by docker-compose)
3 execute a command inside a running container

Links are used to enable secure communication between two containers. The first container is, oddly enough, called the child. This is odd because it is usually a server and it has to be started first …​ The first container exposes a port and is labelled with a name.

# Child or first container
sudo docker run -i -t -h puppet -name puppetmaster pra/pmaster /bin/bash

# Parent or second container have all info to connect to the first
sudo docker run -i -t -h minion -name minion -link puppetmaster:puppet pra/minion /bin/bash

SSH-tunnel

ssh -q -M -S my-ctrl-socket -fnNT -L 27017:localhost:27017 alhazen@pulp.irisnet.be

# to use the host network: --net host
docker run --net host -e PULP_LOGIN=$(PULP_LOGIN) -e PULP_PWD=$(PULP_PWD) --rm -v $(PWD):/code -ti test /code/bin/clean.py $(ENV) --repo-name=$(REPO_ID)

ssh -q -S my-ctrl-socket -O exit alhazen@pulp.irisnet.be 2> /dev/null

Export/Import

Export acts on containers! It currently does not work from containers to images …​ It is really brittle right now (just wait for 1.0).

In the meanwhile it is possible to use any image as your base image in the Dockerfile …​

Mount

You cannot mount a host dir with the VOLUME instruction inside the Dockerfile. You need to pass it at runtime :

# !! first -v, then -t !!
docker run -it -v /media/puppet-stack-middleware:/etc/puppet/environments/middleware_local:ro pra/puppetmaster /bin/bash

Initial Win7 host setup

Win7 hosts a docker ubuntu VM (standard install) using vagrant.

Change the Vagrantfile to mount the shared `puppet-stack-middleware` directory:

config.vm.share_folder "puppet-stack-middleware", "/media/puppet-stack-middleware", "C:/Users/pradermecker/VirtualBox VMs/shared/puppet-stack-middleware"

Connect to the docker VM from an arch VM with:

ssh -p 2222 vagrant@10.0.2.2

Create a dir puppetmaster and a file inside called Dockerfile. Build with sudo docker build .

Then you need to ssh-copy-id your public id_rsa.pub key to be able to fetch the Docker configuration from Github.

Troubleshooting

WARNING

In CentOS 6.4, usePAM needs to be set to no while it needs to be set to yes in 6.5

WARNING

The latest official CentOS image, currently 6.5, comes with a broken centos.plus version of libselinux. To remove it you need to:

yum downgrade --skip-broken libselinux libselinux-utils

Docker compose

Swarm node

Each node is configured by puppet and contains:

  • a swarm container running inside docker (spawned by the docker engine daemon)

  • a docker registrator running inside docker (spawned by the docker engine daemon)

  • a consul agent (doesn’t run within docker)

DNS

You can use Consul as a DNS service. dnsmasq is configured within each swarm node while every docker container inside a node runs with --dns 172.17.0.1.[8]

Salt

Targeting

Minion id

  • unique (FQDN by default)

  • can be overridden in the minion config file

  • if changed, P/P keys need to be regenerated

  • match by shell-style globbing around the minion id or top file

  • use single quotes

  • Perl-compatible regex can be used with the -E option

salt '*.be.brussels' test.ping
salt -L 'web1,web2,web3' disk.usage
salt -E 'web[0-9]' cmd.exec_code python 'import sys; print sys.version'
base:
  'web-(devel|staging)':
    - match: pcre
    - webserver

Grains

  • static bits of information that a minion collects when the minion starts

  • can be statically described in the minion config file with the option grains

  • available to Salt modules

  • automatically synced when state.highstate is called.

    salt -G 'os:CentOS' --batch-size 25% grains.item num_cpus

Node groups

  • predefined group of minions declared in the master

  • declared using compound matchers (see doc)

Salt states

Use SLS files (SaLt State) to represent the state of a system.

  • SLS files are just dictionaries, lists, strings, and numbers (HighState data structure)

  • default serialization format is YAML with the Jinja2 templating system

  • system data and functions can be used via salt, grains and pillar

  • files are combined to form a salt state tree using source, include and extend

declaration-id: (1)
  pkg:
    - installed
  service:
    - running
    - watch: (2)
      - pkg: apache
      - file: /etc/httpd/conf/httpd.conf

/etc/httpd/conf/httpd.conf:
  file.managed:
    - source: salt://apache/httpd.conf
    - user: root
    - group: root
    - mode: 644
1 declaration-id set the name of the thing that needs to be manipulated
2 watch & require to manage order and events
# given a sls web/apache.sls
salt '*' state.sls web.apache

Salt file server & top file & environment

The top file is used to map what modules get loaded onto what minions

base: (1)
  'bmob': (2)
    - packages (3)
1 environment
2 target for state.highstate
3 sls module name

The file server is suitable for distributing files to minions

file_roots:
  base:
    - /srv/salt

External Auth

# The external auth system
external_auth:
  ldap:
    GP_APP_JENKINS%:
         - 'test.*'
         - 'grains.*'
         - 'pillar.*'
    pradermecker:
      - 'G@hostname:middleware': (1)
         - '.*'
         - '@runner' (2)
         - '@wheel'
         - '@jobs'
    jfroche:
         - 'saltutil.*'
         - '@runner'
         - '@wheel'
         - '@jobs'

auth.ldap.basedn: OU=ACCOUNTS,OU=CIRB-CIBG,DC=ad,DC=cirb,DC=lan
auth.ldap.binddn: CN=<%= @ldap_name %>,OU=Saltmasters,OU=Apps,OU=Service_Groups_Accounts,OU=ACCOUNTS,OU=CIRB-CIBG,DC=ad,DC=cirb,DC=lan
auth.ldap.bindpw: <%= @ldap_pwd %>
auth.ldap.filter: (sAMAccountName={{username}})
auth.ldap.port: 389
auth.ldap.server: svidscavw003.prd.srv.cirb.lan
auth.ldap.tls: False
auth.ldap.no_verify: True
auth.ldap.activedirectory: True
auth.ldap.groupclass: group
auth.ldap.accountattributename: sAMAccountName
auth.ldap.persontype: person
1 Define the allowed targets (compound). No relation to the salt notion of environment.
2 Access to the runner module, but this works only via the salt-api. On the command line, salt-run does not support the pam or ldap flag.

Standalone minions

Minions can run without a master. In the minion config file, set the option file_client: local

By default the contents of the master configuration file are loaded into pillar for all minions; this enables the master configuration file to be used for global configuration of minions. To prevent the master config from being added to the pillar, set pillar_opts to False.

Master Event

event = salt.utils.event.MasterEvent('/home/vagrant/projects/jules/var/run/salt/master')
event.get_event(wait=20, tag='salt')

Pillars

The data can be arbitrary. The pillar is built in a similar fashion as the state tree: it is composed of sls files and has a top file, just like the state tree. The default location for the pillar is /srv/pillar (the "pillar_roots" master config key).

GITFS

When using the gitfs backend, Salt translates git branches and tags into environments, making environment management very simple.

fileserver_backend:
  - git

gitfs_remotes:
  - http://stash.cirb.lan/scm/middleware/salt-stack.git

Salt API

curl -si salt.sta.srv.cirb.lan:8000/login \
        -H "Accept: application/json" \
        -d username='jfroche' \
        -d password='xMLrzzzz' \
        -d eauth='pam' > /tmp/cookies.txt
curl -b /tmp/cookies.txt -si salt.sta.srv.cirb.lan:8000 \
    -d client='runner' \
    -d mods='orchestration.bootstrap-puppet' \
    -d fun='state.orchestrate' \
    -d eauth='pam'

curl -ssik https://salt.sta.srv.cirb.lan:8000/run  \
      -H 'content-type: application/json' -H 'Accept: application/x-yaml'  -d '[{
      "username": "xxx",
      "password": "xxxxxx",
      "eauth": "ldap",
      "client": "runner",
      "fun": "doc.execution"
     }]'

Orchestration

[main]
SALTAPI_URL=http://saltmaster.sandbox.srv.cirb.lan:8000
SALTAPI_USER=pradermecker
SALTAPI_PASS=pass
SALTAPI_EAUTH=pam
salt-run state.orchestrate orch.test saltenv=middleware (1)
pepper '*' test.ping
pepper 'puppetmaster2*'  grains.item subgroup role
pepper --client=runner state.orchestrate mods=orchestration.bootstrap-puppet
1 pick up the gitfs branch that hosts the orch.test source

set_puppet_role_to_master:
    salt.function:
        - name: utils.set_role
        - tgt: 'G@role:server and G@subgroup:puppet'
        - kwarg:
            role: master
        - require:
          - salt: run_saltmaster

# /srv/salt/orch/test-puppet.sls
run_puppet_jenkinsmaster:
    salt.state: (3)
        - sls:
          - puppet (4)
        - tgt: 'G@role:master and G@subgroup:jenkins'
        - tgt_type: compound

ping_saltmaster:
    salt.function: (1)
        - name: test.ping
        - tgt: 'role:saltmaster'
        - tgt_type: grain
        - require: (2)
           - salt: run_puppet_jenkinsmaster

# /srv/salt/puppet.sls:
puppet:
    module.run:
        - name: cmd.run
        - arg:
           - 'puppet agent --verbose --onetime --no-daemonize --color false'
1 To execute a function, use salt.function
2 Force order
3 To execute a module, use salt.state
4 Execute the module /srv/salt/puppet.sls

Salt SSH

make salt-ssh HOST=jenkins2 ZONE=prod CMD="state.sls utils.migrate_puppet3"

Useful commands

salt '*' saltutil.sync_all
pep 'svappcavl704.dev.srv.cirb.lan' cmd.run "cat /etc/salt/master" | jq '.return[]' | jq -r '.[]'
pep 'svappcsvl028.prd.srv.cirb.lan' cmd.run "cat /etc/salt/master" | jq '.return[]' | jq -r '.[]'

Postgrest

http://pgserver.sandbox.srv.cirb.lan:3000/jids?jid=eq.20150831150415858891
http://pgserver.sandbox.srv.cirb.lan:3000/salt_returns?full_ret->>jid=eq.20150831150437889173

Install PRD / Bootstrap

## get salt/puppet version we want
## We do need to update puppet because the current salt config does not work with < 3.8
yum versionlock delete 0:*
yum install salt-master salt-minion puppet
# temp /etc/hosts to point to the new salt master
systemctl start salt-master
systemctl start salt-minion
salt '*' saltutil.sync_all

## we need to manually change the config of /etc/salt/master:
#
#  file_roots:
#    base:
#      - /srv/salt/
#    middleware:
#      - /srv/salt/middleware

## new puppetmaster, foreman, puppetdb, pgserver
# temp /etc/hosts to point to the new salt master

# we still need to manually
yum makecache fast
yum update -y
yum clean all

# we still need to manually
mkdir -p /etc/facter/facts.d/
vim /etc/facter/facts.d/host-info.txt

# and finally we need piera to get hiera data before we can bootstrap ...


## Test that every ping works correctly

salt-run state.orchestrate orch.ping saltenv=middleware

## There are issues when puppetconfig restart the minion during the orchestration process
## Let's do it manually

salt -C 'G@role:master and G@subgroup:puppet and G@hostgroup:middleware' puppetutils.run_apply  hostgroup=middleware role=server zone=prod subgroup=puppet


salt -C 'G@role:saltmaster and G@hostgroup:middleware and G@zone:prod' puppetutils.install_stackrpm hostgroup=middleware zone=prod

salt -C 'G@role:saltmaster and G@hostgroup:middleware and G@zone:prod' puppetutils.run_apply hostgroup=middleware role=saltmaster zone=prod

salt -C 'G@role:pgserver and G@hostgroup:middleware and G@zone:prod' puppetutils.run_agent hostgroup=middleware zone=prod

Issues

  • When the master restarts, windows minions do not seem to be able to reconnect (without a minion restart)

Git

Internals

Git maintains snapshots of a directory’s contents. It is a content-addressable filesystem: a simple key-value data store. Keys are SHA-1 hashes and values are objects.

There are 4 different types of objects:

  • Blob stores files (it does not store the name of the file)

  • Tree references other trees and/or blobs, stores the file name and groups them together (as directories do)

  • Commit points to a single tree and realizes "snapshots".

  • Tag marks a specific commit
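
As a sketch of the content-addressing (my example, using the cryptonite package): a blob’s key is the SHA-1 of a small header followed by the content.

{-# LANGUAGE OverloadedStrings #-}
import Crypto.Hash (hashWith, SHA1 (..))
import qualified Data.ByteString.Char8 as BS

-- key of a blob object: sha1("blob <size>\0<content>")
gitBlobKey :: BS.ByteString -> String
gitBlobKey content =
  let header = "blob " <> BS.pack (show (BS.length content)) <> "\0"
  in  show (hashWith SHA1 (header <> content))

-- gitBlobKey "test content\n" gives the same key as
-- `echo 'test content' | git hash-object --stdin`:
-- d670460b4b4aece5915caf5c68d12f560a9fe3e4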

Tips

Ignore files in all projects but keep this for yourself
  1. Add to your ~/.gitconfig file

    ```
    [core]
    excludesfile = /home/username/.gitignore
    ```
  2. Create a ~/.gitignore file with file patterns to be ignored

Delete a range of tags both locally and remotely
for i in `seq 248 638`; do git tag -d $i; git push --delete origin $i;done

Linux

Script

#!/bin/bash -xe
set -euo pipefail (1)

while read host
do
       cat plone_team.pub | ssh alhazen@"$host".prd.srv.cirb.lan "grep plone ~/.ssh/authorized_keys || cat >> ~/.ssh/authorized_keys"
done < plone-prod-uniq.txt
1 exits immediately when a command fails (e) even within a pipe (o pipefail), treat unset variables as an error (u)
pandoc --latex-engine=xelatex -o blog.pdf http://blog.jakubarnold.cz/2014/07/22/building-monad-transformers-part-1.html

LVM

  • change disk size on the VCloud

  • create a new partition with fdisk (ie: sdb1) so we don’t change anything on the existing partition table

  • add this new partition as a new physical volume: pvcreate /dev/sdb1

  • vgextend system_vg /dev/sdb1

  • lvextend -L+12G /dev/system_vg/data

  • xfs_growfs /dev/system_vg/data

or by adding a new disk using puppet :

  • add a new disk on the VCloud

  • after a short delay, VCloud will automatically create a new device, for instance '/dev/sdd'

  • add this new device as a new physical volume: pvcreate /dev/sdd. You can see it with pvs

  • vgextend vg_rhel /dev/sdd (the name 'vg_rhel' is fixed for our new RHEL 7 template)

  • puppet agent -t will now create a new lv 'nix'. You can see it with lvs

at the CIRB the easiest way is:

  • to ask for a machine with 40G (second disk usually /dev/sdb)

  • The machine will be received with a full vg_rhel of 40G. Go to the vcloud console and extend the second disk to 60G

  • The machine now has a /dev/sdb disk with 60G. Extend the pv using pvresize -v /dev/sdb, and check with vgdisplay or pvs.

Tips

Add route in windows
route ADD 192.168.30.0 MASK 255.255.255.0 10.255.10.4
SCP copy from local to remote
scp -i ~/.ssh/user_rsa -r folder user@svifscapl003.prd.srv.cirb.lan:/tmp
SCP copy from remote to remote

Using your local computer

ssh-add ~/.ssh/alhazen_rsa
# Give alhazen the permission to write on targetfqdn:/srv/tmp
ssh -A -i  ~/.ssh/alhazen_rsa alhazen@sourcefqdn \
"scp -o StrictHostKeyChecking=No /srv/data/pgserver.dump alhazen@targetfqdn:/srv/tmp"
SSH with password for a specific host
~/.ssh/config
Host 192.168.xx.xx
  PreferredAuthentications password
NsLookup
$ nslookup.exe stash.cirb.lan 192.168.34.2xx (1)
Non-authoritative answer:
Server:  svidscapw000.ad.cirb.lan
Address:  192.168.34.2xx

Name:    stash.cirb.lan
Address:  192.168.34.xx
1 DNS to lookup + DNS server

Definitions

Push (SSE) vs Pull (REQ/REP)
Application layer

HTTP, SNMP, AMQP, XMPP, IRC, DHCP, WebDAV, SSH, FTP, SIP, Telnet

Transport layer

TCP, UDP (SCTP)

Logs

journalctl -r - show logs in reverse order
journalctl -b - show logs since last boot
journalctl -k - show kernel logs
journalctl -p warning - show logs with warning priority
journalctl -p err - show logs with error priority
journalctl --since=2016-08-01 - show logs since
journalctl --until=2016-08-03 - show logs until
journalctl --until=today - show logs until midnight today
journalctl --since=yesterday - show logs since yesterday midnight
journalctl --since=-2week - show logs for last 2 weeks
journalctl -u <unit-name> - show logs of certain unit
journalctl /dev/sda - show kernel message of device
journalctl -o json - show logs in json format

Postgres

General

Glossary

PEM

Postgres Enterprise Manager

PPAS

Postgres Plus Advanced Server

WAL

At all times, PostgreSQL maintains a WAL (Write Ahead Log) in the pg_xlog/ subdirectory of the cluster’s data directory. The log describes every change made to the database’s data files. This log exists primarily for crash-safety purposes: if the system crashes, the database can be restored to consistency by "replaying" the log entries made since the last checkpoint. However, the existence of the log makes it possible to use a third strategy for backing up databases: we can combine a file-system-level backup with backup of the WAL files. If recovery is needed, we restore the backup and then replay from the backed-up WAL files to bring the backup up to current time.

Architecture

One process per user, NO THREADS! Processes are managed by the Postmaster, which acts as a listener for new connections and as a supervisor to restart them.

The term "buffer" is usually used for blocks in memory.

7500 concurrent users → connection pooling

16MB

Cluster

A cluster is a collection of databases. Clusters have separate:

  • data directory

  • TCP port

  • set of processes

To create a cluster execute the following command with the postgres user (! not root !):

[postgres]$ initdb --locale en_US.UTF-8 -E UTF8 -D '/var/lib/postgres/data'

To create a second cluster on the same machine you need to:

  • as root, create a DATA directory

  • as root, change the owner of the DATA directory to enterprisedb or postgres (depending on the version of postgres, enterprise or community)

  • as postgres (or enterprisedb), do: initdb -D '/var/lib/postgres/data'

There is a little tricky behavior with the second cluster when you want to connect with a client. By default, connections will be refused for user "enterprisedb" …​ You need to change the pg_hba file and set "trust" for enterprisedb …​ Then set a password with the client and put it back to md5.

host all enterprisedb 192.168.104.0/24 trust

NixOS

services.postgresql = {
    enable = true;
    authentication = ''
      local saltstack all trust
    '';
  };
CREATE USER vagrant SUPERUSER LOGIN; (1)

CREATE USER salt LOGIN; (2)
CREATE DATABASE saltstack WITH owner = 'salt';
ALTER USER salt WITH password 'saltpass';

psql saltstack -U salt (3)
1 as root
2 as vagrant
3 as vagrant, check that you can connect to the db

Tips

Assign the result of a multi-record select to a variable in plpgsql

CREATE OR REPLACE FUNCTION notify_result() RETURNS TRIGGER AS $$

DECLARE
notification jsonb;
chan text;

BEGIN

-- Get the user as the name of the channel
SELECT load->>'user' into chan from jids where jid = NEW.jid;
-- This is not working because salt_returns table haven't been filled in yet ...
notification := (SELECT array_to_json(array_agg(row_to_json(t))) from (SELECT r.full_ret FROM salt_returns r where r.jid = NEW.jid) t);

-- Execute pg_notify(channel, notification)
PERFORM pg_notify(chan, NEW.jid);

-- Result is ignored since this is an AFTER trigger
RETURN NULL;
END;

$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS notify_result on jids;
CREATE TRIGGER notify_result AFTER INSERT ON jids
    FOR EACH ROW EXECUTE PROCEDURE notify_result();
COPY
COPY edbstore.categories TO '/tmp/test.csv' WITH (FORMAT 'csv');
Backup & Recovery

for a small DB, we can use a sql dump several times a day:

pg_dump dbname | gzip > filename.gz
pg_dump dbname | split -b 1m - filename

psql dbname < infile
Shell
psql -h 192.168.14.62 -W -U pradermecker postgres < create_PGPUPPETDB.sql

export PGPASSWORD=dbpasswordforpuppetdb
ssh puppetmaster-prod 'sudo -u postgres pg_dump puppetdb ' | psql -h 192.168.14.62 -U puppetdb -w PGPUPPETDB

for t in $(psql -U enterprisedb -d edbstore -t -c "select tablename from pg_tables where tableowner='edbstore'"); do
  pg_dump -t edbstore.$t -U enterprisedb edbstore > $t.sql;
done

select tablename from pg_tables where tableowner='edbstore';
select table_name from information_schema.tables where table_schema='edbstore';

Replication

  • Hot Streaming Replication (Warm Streaming Replication or Log WAL Shipping is deprecated). There is a daemon process started by the Postmaster.

We don’t have to start the slave before the master. The slave can just wait for a master to start up.

  1. First shutdown the master and set it up for replication by:

    1. change postgres.conf

      wal_level = hot_standby
      max_wal_senders = 4
      wal_keep_segments = 32
      archive_mode = on
      archive_command = 'cp %p /data/archive/%f'
    2. change pg_hba.conf:

      host  replication  repuser slaveip/32  md5
  2. Configure the pg_hba.conf of the slave:

    host  replication  repuser masterip/32  md5
  3. Initialize the cluster

On a local server, you can just copy the data folder from the master to the slave or pg_basebackup -h localhost -D /opt/PostgresPlus/9.3AS/data1 but on a real set up you would follow these steps:

  1. on the master:

    postgres=# select pg_start_backup('cluster_init');
  2. on the slave:

    rsync -avz --delete --inplace --exclude-from=/srv/pgsql/do-not-sync  root@195.244.165.68:/srv/pgsql/data/ /srv/pgsql/data (1)
    1 with the postgres user
  3. on the master

    postgres=# select pg_stop_backup();

" PAX process

Select * from pg_stat_activity
select * from pg

Programming Notes

Notion

Functional Programming

The meaning of the programs is centered around evaluating expressions rather than executing instructions.

This is the key to functional programming’s power — it allows improved modularization

A functional program is value oriented: what we are building are sentences made from different values and higher-order functions. The types and higher-order values define the grammar of those sentences.
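
A tiny sketch (my example): the whole 'sentence' below is values and higher-order functions composed together, with no instruction executed step by step:

total :: [Int] -> Int
total = sum . map (* 2) . filter even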

Algebraic Data Type

A struct or new type, composed from other types either as a product or a sum type.

Name   Members       Inhabitants

Void                 0

Unit   ()            1

Bool   True, False   2

Going from there you can define by sum a type with 3 elements:

data Add a b = AddL a | AddR b
-- or
data Either a b = Left a | Right b

-- if a is Bool and b is () you have got:
addValues = [AddL False, AddL True, AddR ()]

You can also use a product type with Mul:

data Mul a b = Mul a b
-- or
data (,) a b = (a, b)

mulValues = [Mul False False, Mul False True, Mul True False, Mul True True]
Abstract Data Type (ADT)

A data type is said to be abstract when its implementation is hidden from the client. ADTs are types which encapsulate a set of operations. The concept originates from CLU (Liskov, 1972):

Modules → Partitions → ADTs

The canonical example is a Stack for which we define a set of operations including a way to construct/get a new empty Stack.
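
A minimal sketch in Haskell (my example): the Stack constructor is hidden, so clients can only go through the exported operations:

module Stack (Stack, empty, push, pop) where  -- constructor not exported

newtype Stack a = Stack [a]

empty :: Stack a
empty = Stack []

push :: a -> Stack a -> Stack a
push x (Stack xs) = Stack (x : xs)

pop :: Stack a -> Maybe (a, Stack a)
pop (Stack [])       = Nothing
pop (Stack (x : xs)) = Just (x, Stack xs)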

This is very different and even dual to the concept of objects in OO. ADT operations belong to their datatype, whereas OO objects are implemented as collections of observations (methods) that can be performed upon them. The focus on observations, rather than construction, means that objects are best understood as co-algebras.

Hash Value

Hashing is a transformation AnyText → TextWithFixedSmallerSize (an array of bytes) called a digest or hash value, with the following (ideal) properties:

  • it is quick to compute

  • it is not reversible: you cannot get AnyText back from the digest

  • the digest is unique, so two different AnyTexts will always have different digests.

The idea is to store this mapping in a database so that you use digest as a representation for AnyText (the digest becomes the id/handle for the Text). Given such a mapping you can also hash AnyText, get a digest and do a lookup in the table to see if the mapping already exists.
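
A sketch of such a store (my example; Data.Hashable's hash is not cryptographic, a real store would use SHA-1 or similar):

import Data.Hashable (hash)   -- from the hashable package
import Data.Map (Map)
import qualified Data.Map as Map

type Digest = Int
type Store  = Map Digest String

-- store a text and get back its digest (the id/handle for that text)
store :: String -> Store -> (Digest, Store)
store txt db = let d = hash txt in (d, Map.insert d txt db)

-- look a text up by its digest
retrieve :: Digest -> Store -> Maybe String
retrieve = Map.lookup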

CAP
  • Consistency

  • Availability

  • Partition tolerance

Glossary
nibble

Half of one byte. So 4 bits/digits → 16 values

subroutine

Synonym for function

subtype
Circle <: Shape
Object Oriented Programming

Objects by definition include mutable state → intensional object identity !!

Quotes

Monitoring

Software should be designed from the start to be monitored

ORM

ORMs are mixing different concerns. They were introduced by OO zealots to avoid the declarative nature of SQL. Now, according to Martin Fowler, they are just a way to get a memory cache. Yes right, but that is not the way ORMs have been sold, is it?

The whole ORM story looks like a complete disaster. Building a graph of objects in memory across sessions has proven to make little sense in many projects I have worked on.

If you deal with a relational database, abstracting it with mutable pojos is dubious at best. I am pretty convinced a nice query API such as LINQ can solve the problem of handling the myriads of SQL statements.

Here is the problem: in the end, data is fed into a viewer. So let’s get this straight. Output JSON directly from a query language interface!

Scrible

Please do understand the difference between → and ⇒ …​ what you need here is "leads to", maybe "implies"

RAM : Heap & Stack

IDENTITY: LABEL FOR A TIMELINE. CONFLATE ID WITH STATE. REF TYPES: BOXES TO VALUES

How do we express polymorphism in UML? You mark the class with a stereotype. You have to see a class as something really global in UML. It is just a blueprint of code.

Elasticsearch

Characteristics

EL is built upon the Lucene search engine. Everything is stored in an inverted index. It features:

  • HA

  • automatic index creation by generating a "mapping"

Terminology

index

All documents live in an index. An index is roughly the same as a database, which means it is just a namespace.

type

Before version 6, an index could have one or more types. A type was like a table, a collection of similar things. This notion of type is totally deprecated. In version 6, you still have to indicate one type (usually called _doc by convention). From version 7 on, the type is optional and it will ultimately disappear from EL jargon.

document

A document is like a row. It is composed of field/value pairs (a field is like a column in an RDB).

version

ES only keeps one version of a document. The version number is kept by ES for engineering purpose but should not be used in the applicative/business layer.

mappings

Map fields with data types.

analysis

process of converting full text into terms for the inverted index

node

An instance of EL. Usually one per machine.

cluster

A set of nodes. You might separate nodes into clusters because:

  • the usage/ownership/…​ of the data are different

  • the nodes are located in two different datacenters

shards

By default each index is divided into 5 pieces called shards. This number is defined at index creation. A document will live on a single shard. EL tries to evenly distribute the documents of an index among all the shards.

A shard is a single instance of Lucene and should roughly be kept to a size of about 10G.

replica

Shards usually exist in 2 copies (number of replicas = 1)

segments

A shard is written on disks in multiple segment files.

Data types

  • Simple

    • text: full text analyzed strings

    • keyword: sorting/aggregation of exact values (not analyzed).

    • byte/short/integer/float/double: numeric value

    • date

    • boolean

    • ip

  • Hierarchical: object, nested

  • Range

  • Specialized: geo_point, percolator

APIs

Index search
GET blogs/_search
{
  "query": {
    "match": {
      "content": "ingest nodes"
    }
  }
}
Create your own mapping
PUT blogs
{
  "mappings": {
    "_doc": {
      "properties": {
        "content": {
          "type": "text"
        },
        ...
      }
    }
  }
}
Test the analyzer
GET _analyze
Cluster
GET _cluster/state
Update index settings
PUT blogs/_settings
{
  "settings": {
    "number_of_replicas": 0 (1)
  }
}
1 you can dynamically change the number of replicas but not the number_of_shards
Reindex
POST _reindex
{
  "source": {
    "index": "blogs",
      "query": {
        ...
      }
  },
  "dest": {
    "index": "blogs_fixed"
  }
}
Ingest
PUT _ingest/pipeline/fix_locales
{
  "processors": [
    {
      "script": {
        "source": """
if("".equals(ctx.locales)) {
  ctx.locales = "en-en";
}
ctx.reindexBatch = 3;
"""
      }
    },
    {
      "split": {
        "field": "locales",
        "separator": ","
      }
    }
  ]
}

Node roles

  • master eligible

    Only one master node per cluster. It is the only node capable of changing the cluster state. You need an odd number of master-eligible nodes (quorum) to avoid split brain.

  • data

    Hold the shards and execute CRUD operations.

  • ingest

    Pre-process documents through ingest pipelines before indexing.

  • coordinator

    Receive client requests. Every node is implicitly a coordinating node. Coordinators act as smart load balancers.

Cluster management

shard filtering

shard allocation awareness

Logstash

The main role of Logstash is to transform (in a centralized place) a stream of data before it is indexed in EL. For some data inputs such as SQL databases it is the only officially "supported" way to get the data into EL.

Beats

APM

Category theory

Abstract algebra of functions. In category theory we never look inside objects. All information about objects is encoded in the arrows (morphisms) between them.

definition

A category is a bunch of objects together with morphisms [9]. The objects have no structure or meaning; actually they only serve to mark the beginning or end of an arrow. Morphisms are direct mappings between these objects [10] that preserve a structure. The structure, whatever it is, characterizes the category.

  • there must exist a morphism called identity (the zero) that maps an object into itself (e.g: 1A).

  • the morphisms need to compose while respecting associativity:
    h∘(g∘f) == (h∘g)∘f

Example:

In a functional programming language, morphisms/arrows are functions and objects are types.
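
A quick pointwise check of both laws (my example), where composition is (.) and identity is id:

f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g = (* 2)

h :: Int -> String
h = show

identityLaw :: Bool
identityLaw = (id . f) 3 == f 3 && (f . id) 3 == f 3

assocLaw :: Bool
assocLaw = (h . (g . f)) 3 == ((h . g) . f) 3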

Purescript

Quick setup

Install
→ nix-env -iA nixos.purescript
→ nix-env -iA nixos.psc-package
→ nix-env -f ~/.config/nixpkgs/pin.nix -iA nodePackages.pulp
Project
→ pulp --psc-package init
→ pulp build --to dist/test.js
→ cat > dist/test.html <<EOF
<!doctype html>
<html>
  <head>
    <title>Test Purescript</title>
    <style>
      body {
        font-family: sans-serif;
        max-width: 570px;
        margin: auto;
      }
    </style>
  </head>
  <body>
    <script src="test.js"></script>
  </body>
</html>
EOF

Tips & tricks

Purescript   Haskell

<<<          .


1. This means that if a package provides a bin subdirectory, it’s added to PATH; if it has an include subdirectory, it’s added to GCC’s header search path; and so on
2. If no such files exists, it will default to <nixpkgs>
3. similar to overridePackages which is only used outside of the special config.nix for specific use cases
4. Each instance implements the same function differently; or to say it differently, one function will behave differently according to the types of its arguments
5. Unfolding is then associated to producers or monads.
6. state monad transformer.
7. a mental model that might help is to look at each filepath as being a list of strings, not just one string
8. the DNS host for every docker is always 172.17.0.1
9. also called arrows
10. you can have zero, one or many arrows from one object to another