Nix
Nix is not a configuration management tool like Puppet, Chef or Salt. It is more accurately described as a (universal) package manager.
In that regard, unless you are running NixOS or Disnix (or use other tricks), Nix won't configure systemd services for you, for instance.
Nix only operates on its store (usually located in /nix) to gather packages, called derivations in Nix parlance.
Nix is a radical rethink of the distribution model. It offers:
- best possible build reproducibility
- self-contained environments
- easy rollback
- composability of derivations
NixOS
Install
- Switch to azerty keyboard
→ loadkeys be-latin1
- Partition with gdisk (efi) or fdisk (noefi); using virtualbox you don't want/need efi
→ (g/f)disk /dev/sda
Create 2 partitions: sda1 (type 83, the default) and sda2 (type 82, swap).
[efi] Create an extra (boot) partition with type EF00.
- Create the file systems
→ mkfs.ext4 -L nixos /dev/sda1
→ mkswap -L swap /dev/sda2
[efi] Choose vfat for the boot partition.
- Mount it
→ mount /dev/disk/by-label/nixos /mnt
[efi] mkdir /mnt/boot and mount the boot partition there.
- Generate a default config
→ nixos-generate-config --root /mnt
- Minimally edit the config; don't forget to uncomment the option boot.loader.grub.device
→ vim /mnt/etc/nixos/configuration.nix
[efi] No edit required.
- Install
→ nixos-install
- Reboot
→ reboot
- Upgrade
→ nixos-rebuild boot --upgrade
→ reboot
Configuration
Some NixOS options are set in /etc/nixos/configuration.nix, for example:
nixpkgs.config.allowUnfree = true;
i18n = {
consoleFont = "Lat2-Terminus16";
consoleKeyMap = "be-latin1";
defaultLocale = "en_US.UTF-8";
} ;
environment.systemPackages = with pkgs; [
asciidoctor (1)
];
# Define a user account. Don't forget to set a password with ‘passwd’.
users.extraUsers.nix = { (2)
createHome = true;
home = "/home/nix";
isSystemUser = true;
extraGroups = [ "wheel" "disk" "vboxusers" "docker"];
shell = "/run/current-system/sw/bin/bash";
uid = 1000;
};
programs.bash.enableCompletion = true;
security.sudo.wheelNeedsPassword = false;
fonts = {
enableFontDir = true;
fonts = [ pkgs.source-code-pro ];
};
nix.extraOptions = ''
gc-keep-outputs = true (3)
gc-keep-derivations = true (3)
'';
virtualisation.docker.enable = true;
virtualisation.docker.extraOptions = "--insecure-registry x.lan --insecure-registry y.lan";
virtualisation.virtualbox.guest.enable = true; (4)
boot.initrd.checkJournalingFS = false; (4)
1 | add packages |
2 | do create a new user! (by default root cannot run a chromium session) |
3 | prevent overly aggressive gc in a developer environment |
4 | virtualbox only |
System management
→ sudo nixos-rebuild switch
→ sudo nixos-rebuild boot --upgrade (1)
1 | safer to use boot when upgrading |
Derivation
Nix produces a build product in two steps:
Nix expression (evaluation) → Derivation (realisation) → Build product
The first step, evaluation, is pure. The produced .drv file acts as an intermediate specification for a build, one that can be freely redistributed to a set of machines.
Derivations are stored in the nix store as /nix/store/hash-name, where the hash uniquely identifies the derivation (not quite true, it's a little more complex than that) and name is the name of the derivation.
From a nix language point of view, a derivation is simply a set, with some attributes.
To build a package, nixpkgs makes heavy use of stdenv and its function mkDerivation:
stdenv.mkDerivation rec {
name = "libfoo-${version}"; (1)
version = "1.2.3"
src = fetchurl {
url = http://example.org/libfoo-1.2.3.tar.bz2;
md5 = "e1ec107956b6ddcb0b8b0679367e9ac9"; (2)
};
builder = ./builder.sh; (3)
buildInputs = [ruby]; (4)
}
1 | mandatory name attr |
2 | mandatory checksum for remote source |
3 | if not provided, the generic builder is used |
4 | additional input required to build the derivation[1] |
The output of a derivation needs to be deterministic. That's why you can only fetch remote sources when you know their hash beforehand.
- runtime dependencies
A derivation never specifies runtime dependencies. These are automatically computed by Nix. You can print them with:
nix-store -q --tree $(nix-store -qd $(which cabal2nix))
- overrideDerivation drv f
Takes a derivation and returns a new derivation in which the attributes of the original are overridden according to the function f. Most of the time, you should prefer overrideAttrs, as shown below.
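For example, a minimal overrideAttrs sketch (the package and the extra configure flag are illustrative):
with import <nixpkgs> {};
hello.overrideAttrs (old: {
  # append to the original configure flags instead of replacing them
  configureFlags = (old.configureFlags or []) ++ [ "--disable-nls" ];
})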
Channels
A channel is the Nix mechanism for distributing a consistent set of Nix expressions and binaries.
→ nix-channel --add http://nixos.org/channels/nixpkgs-unstable
→ nix-channel --update
→ nixos-rebuild switch
The unstable channel is usually a few days behind nixpkgs master. For the precise status, check here.
You can directly use a derivation from master. For instance, after cloning nixpkgs, you could type:
→ NIX_PATH=nixpkgs=/home/vagrant/projects/nix/nixpkgs nix-env -f '<nixpkgs>' -iA haskellPackages.stack
Nix-shell
When Nix builds a package, it builds it in an isolated environment. It does this by creating a clean, child shell, then adding only the dependencies the package declares. After setting up the dependencies, it runs the build script, moves the built app into the Nix store, and sets up the environment to point to it. Finally, it destroys this child shell.
But we can ask Nix to not destroy the child shell, and instead let us use it for working iteratively on the app. This is what the nix-shell is about: it will build the dependencies of the specified derivation, but not the derivation itself.
nix-shell '<nixpkgs>' -p ruby haskellPackages.stack (1)
1 | -p and -A are mutually exclusive |
If a path is not given, nix-shell defaults to shell.nix if it exists, and default.nix otherwise.[2]
This allows for a nice trick. We can describe a virtual dev environment (of any sort, for any language) by describing a derivation in default.nix like so:
with import <nixpkgs> {};
let henv = haskellPackages.ghcWithPackages (p: with p; [shake]);
in
stdenv.mkDerivation {
name = "haskell-env";
buildInputs = [ henv pythonPackages.pyyaml];
}
nix-shell will use the buildInputs of this derivation to set up the environment.
You can force any script file to run in a nix-shell as such:
#! /usr/bin/env nix-shell
#! nix-shell -i bash
or without a default.nix file:
#! /usr/bin/env nix-shell
#! nix-shell --pure
#! nix-shell -p asciidoctor -p pythonPackages.pygments
#! nix-shell -p "haskellPackages.ghcWithPackages(p: with p; [shake])" (1)
#! nix-shell -i bash
1 | Double quotes are required. Don't add -p ghc or you will end up with two different ghcs! |
In Haskell, we need --attr env to tell nix-shell to use the env attribute of the derivation defined in shell.nix.
Nix-env
nix-env is the command to search, install and remove packages locally in user space (in a profile). These packages are installed in the nix store but are only accessible inside one environment (aka user/profile).
- -q list installed derivations within a profile
- -qaP list available packages with their attribute path
When searching for packages, it is usually more efficient to specify a namespace attribute using the -A option.
# in nixos:
→ nix-env -qaP -A nixos.haskellPackages
→ nix-env -qaP -A nixos.pythonPackages
# outside nixos:
→ nix-env -qaP -A nixpkgs.pythonPackages
You can also omit the channel namespace and specify the input for nixpkgs explicitly with the -f option:
→ nix-env -f '<nixpkgs>' -qaP -A haskellPackages.shake --description
- -i install derivations
→ nix-env -f '<nixpkgs>' -iA pythonPackages.pyyaml (1)
→ nix-env -f '<nixpkgs>' -i brackets -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/master.tar.gz (2)
1 | on nixos, you might use nix-env -iA nixos.pythonPackages.pyyaml |
2 | install from master directly |
- -e erase derivations
→ nix-env -e python2.7-PyYAML-3.11
- -u update derivations
→ nix-env -u
Nix-build
The nix-build tool does two main jobs:
- nix-instantiate: parse the .nix file and return the .drv file (the evaluation step)
- nix-store -r: realise the build product from the input .drv derivation
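The same two steps can be run by hand; a sketch (the .drv path is whatever nix-instantiate prints):
→ drv=$(nix-instantiate default.nix)
→ nix-store -r $drv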
Language Expressions
- String
let h = "Hello"; value = 4;
in {
  helloWorld = "${h} ${toString value} the win!"; (1)
}
1 | interpolation; the toString builtin converts an int value |
- List
[ 123 ./foo.nix "abc" (f { x = y; }) ]
- Attribute Set
let x = 12; y = 34; f = {n}: 5 + n;
in rec {
  r = {
    inherit x y; (1)
    text = "Hello";
    add = f { n = 56; }; (2)
  };
  sum = r.add + r.y;
  hello = r.text or "World"; (3)
  b = r ? x; (4)
}
1 | when defining a set it is often convenient to copy variables from the surrounding lexical scope |
2 | all ; are mandatory |
3 | set accessor using . ; default using or |
4 | does the set 'r' contain an attribute 'x'? |
- Function
pattern: body
# `min` and `max` are available in stdenv.lib
min = x: y: if x < y then x else y; (1)
1 | the pattern is a function returning a function (2 arguments) |
{stdenv, fetchurl, perl, ... }: (1)
stdenv.mkDerivation { (2)
  name = "hello-2.1.1";
  ...
};
1 | the pattern is a set of arguments; the ellipsis (…) allows the passing of a bigger set, one that contains more than the 3 required arguments |
2 | function call passing a set as argument |
- Common functions
listToAttrs (1)
  [ { name = "foo"; value = 123; } { name = "bar"; value = 456; } ]
1 | like fromList in Haskell, except there is no tuple type in Nix |
- With
with e1; e2
Introduces all attributes of the set e1 into the lexical scope of the expression e2:
let as = { x = "foo"; y = "bar"; }; in with as; x + y  # "foobar"
- Optional argument
{ x, y ? "foo", z ? "bar" }: z + y + x (1)
1 | a function that only requires an attribute named x, but optionally accepts y and z |
- Merge sets
e1 // e2 # merge e1 and e2, with e2 taking precedence in case of equally named attributes
- Logical implication
e1 -> e2 (1)
1 | if e1 is false, return true; otherwise return e2. Useful with assert |
Nix modules
A NixOS module is a file that handles one logical aspect of the configuration.
{ config, lib, pkgs, ... }: (1)
{
imports = (2)
[
];
options.services.foo = { (3)
enable = lib.mkOption {
type = lib.types.bool;
default = false;
description = ''
'';
};
...
};
config = lib.mkIf config.services.foo.enable { (4)
environment.systemPackages = [ ... ];
};
}
1 | function declaration with access to the full system configuration and nixpkgs |
2 | paths to other modules that should be included in the evaluation |
3 | options declaration |
4 | option definition |
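Enabling the module from configuration.nix is then just (a sketch, assuming the module is saved as foo.nix):
{
  imports = [ ./foo.nix ];
  services.foo.enable = true;
}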
Tips and tricks
- Customize nixpkgs locally
You can override derivation attributes in user space without forking the nixpkgs repository. In ~/.nixpkgs/config.nix you typically declare a packageOverrides function and then use override to customize attributes:
~/.nixpkgs/config.nix
{
  packageOverrides = super: (1)
    let
      self = super.pkgs;
      foo = super.foo.override { barSupport = true; }; (2)
    in {
      inherit foo;
      haskellPackages = super.haskellPackages.override {
        overrides = self: super: { (3)
          language-puppet_1_3_3 = self.callPackage ./pkgs/language-puppet { inherit foo; }; (4)
        };
      };
    };
}
1 | packageOverrides takes the original (super) nixpkgs set and returns a new (self) record set [3] |
2 | call override (defined on most derivations) to change the arguments passed to it |
3 | override the overrides attribute of haskellPackages |
4 | key = value of the returned set |
- Overlays
Since 17.03 there is a more idiomatic way to achieve such local customization:
~/.config/nixpkgs/overlays/default.nix
self: super:
let hlib = super.haskell.lib;
in {
  haskellPackages = super.haskellPackages.override {
    overrides = hpkgs: _hpkgs: {
      cicd-shell = hlib.dontCheck (hlib.dontHaddock
        (_hpkgs.callCabal2nix "cicd-shell" (super.fetchgit { (1)
          url = "http://stash.cirb.lan/scm/cicd/cicd-shell.git";
          rev = "d76c532d69e4d01bdaf2c716533d9557371c28ea";
          sha256 = "0yval6k6rliw1q79ikj6xxnfz17wdlnjz1428qbv8yfl8692p13h";
        }) { protolude = _hpkgs.protolude_0_2; }));
    };
  };
}
1 | callCabal2nix allows you to automatically fetch and build any haskell package from the web |
- Override haskell packages for the ghc821 compiler
self: super:
let hlib = super.haskell.lib;
in {
  haskell = super.haskell // {
    packages = super.haskell.packages // {
      ghc821 = super.haskell.packages.ghc821.override { (1)
        overrides = hpkgs: _hpkgs: {
          containers = hlib.dontCheck (_hpkgs.containers);
        };
      };
    };
  };
}
1 | haskell equals super.haskell except for packages, which equals super.haskell.packages except for ghc821, which is the overridden version of super.haskell.packages.ghc821 |
- Private packages
You can also extend nixpkgs with private derivations without any forking. For instance, using a custom file:
dotfiles.nix
with import <nixpkgs> {}; (1)
let
  xmonadEnv = haskellPackages.ghcWithPackages (p: with p; [xmonad xmonad-contrib]); (2)
in stdenv.mkDerivation {
  name = "devbox_dotfiles-0.1";
  src = fetchFromGitHub {
    owner = "CIRB";
    repo = "devbox-dotfiles";
    rev = "801f66f3c7d657f5648963c60e89743d85133b1a";
    sha256 = "1w4vaqp21dmdd1m5akmzq4c3alabyn0mp94s6lqzzp1qpla0sdx0";
  };
  buildInputs = [ xmonadEnv ];
  installPhase = ''
    ${xmonadEnv}/bin/ghc --make .xmonad/xmonad.hs -o .xmonad/xmonad-x86_64-linux (3)
    cp -R ./. $out (4)
  '';
  meta = {
    description = "Dot files for the devbox";
  };
}
1 | dependencies provided by nixpkgs using $NIX_PATH |
2 | ghc with module deps included |
3 | at this stage, the shell is inside a temp dir with the src included |
4 | copy the content of the current dir into $out |
You then build the derivation or install it in the user environment:
→ nix-build dotfiles.nix
→ nix-env -f dotfiles.nix -i devbox_dotfiles (1)
1 | nix-env -i takes the name attribute and strips the version (the first numeric after -) |
- Pin a version of nixpkgs
let
  nixpkgs = builtins.fromJSON (builtins.readFile ./.nixpkgs.json);
in import (fetchTarball {
  url = "https://github.com/NixOS/nixpkgs/archive/${nixpkgs.rev}.tar.gz";
  inherit (nixpkgs) sha256;
})
Updating .nixpkgs.json is done with such a zsh function:
function updateNixpkgs () {
  nix-prefetch-git https://github.com/NixOS/nixpkgs.git "$1" > ~/.config/nixpkgs/.nixpkgs.json
}
- Cache the list of all available packages in a local file
nix-env -qaP --description '*' > ~/allpkgs.desc
- Reproduce any hydra build locally
bash <(curl https://hydra.nixos.org/build/57055021/reproduce)
Bootstrap
Nix composes all of these individual functions into a large package repository. This repository essentially calls every single top level function, with support for recursive bindings in order to satisfy dependencies. Continuing with the hello example, we may have a top-level entry point like:
rec {
hello = import /path/to/hello.nix { inherit stdenv fetchurl; }; (1)
stdenv = import /path/to/stdenv.nix { inherit gcc; };
fetchurl = import /path/to ;
gcc = import /path/to/gcc.nix {};
# ...
}
1 | Import loads a file containing a function and then calls that function with the provided arguments |
But wait - I just said this calls all functions… so wouldn’t that then mean that all software gets installed? The trick here is that Nix is a lazy language.
Ruby
- Create or copy a Gemfile at the root dir of the project
- Create a default.nix file:
{ bundlerEnv }:
bundlerEnv rec {
name = "xxx-${version}";
version = "4.10.11";
gemdir = ./.;
}
- Use bundix in the target directory:
$(nix-build '<nixpkgs>' -A bundix --no-out-link)/bin/bundix --magic (1)
1 | magic: lock, pack and write dependencies |
It will create both a gemset.nix file and a Gemfile.lock.
Haskell
Concepts
Type class
Type classes are in a sense dual to type declarations. Whereas the latter define how types are created, type classes define how a set of types is consumed.
When talking about polymorphism, type classes enable a form of ad hoc polymorphism or overloading[4] that needs to be delimited as such to play well with parametric polymorphism and to keep type checking sane.
Type classes are not first class in Haskell. They cannot be used in place of a type (as you would use an interface in Java).
They are internally implemented as dictionary passing: ghc puts the methods of the instance in a dictionary and passes it implicitly to any function having a class constraint.
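A minimal sketch of that desugaring (illustrative names, not GHC's actual Core):
-- class Eq a where (==) :: a -> a -> Bool
-- becomes a record of methods (a "dictionary"):
data EqDict a = EqDict { eq :: a -> a -> Bool }

-- a function with an Eq constraint takes the dictionary explicitly:
member :: EqDict a -> a -> [a] -> Bool
member d x = any (eq d x)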
It is best to look at them as a set of constraints on types. One notable drawback is that each type can have at most one implementation of a given type class.
Eq, Show, Num, Integral, Ord, Enum are classical examples.
class Num a where
(+) :: a -> a -> a
(*) :: a -> a -> a
(-) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
Using enumFromTo from the Enum type class:
→ enumFromTo 3 8 -> [3,4,5,6,7,8]
→ enumFromTo 'a' 'f' -> "abcdef"
In Scala, type-classes are types themselves, and instances are first class values.
Type Family
data Nat = Zero | Succ Nat
-- Add is a type which is a function on types
type family Add (x :: Nat) (y :: Nat) :: Nat
-- Then comes the implementation of the (type) function
type instance Add Zero y = y
type instance Add (Succ x) y = Succ (Add x y)
Typeable
The Typeable class is used to create runtime type information for arbitrary types:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
data Animal = Cat | Dog deriving Typeable
data Zoo a = Zoo [a] deriving Typeable
example :: TypeRep (1)
example = typeRep (Zoo [Cat, Dog]) (2)
-- Zoo Animal
1 | Runtime representation of the type of the value |
2 | typeRep corresponds to typeOf, which is kept for backwards compatibility |
class Typeable a where
typeRep :: Proxy a -> TypeRep (1)
1 | takes a type (Proxy) that it never looks at |
Typeable is actually as old as Haskell (before it was even called Haskell…)
Ref/State Primitives
- MVar
A concurrency primitive, designed for access from multiple threads. It is a box which can be full or empty. If a thread tries to read a value from an empty MVar, it blocks until the MVar gets filled (by another thread). Similarly, writing (putMVar) to a full MVar blocks until it is emptied.
- IVar
An immutable variable: you are only allowed to write to it once.
- STM
retry aborts the transaction and retries it whenever one of the TVars it read gets modified.
- IORef
Just a reference to some data, a cell. It operates in IO. You can think of it like a database, file, or other external data store. atomicModifyIORef uses CAS (compare-and-swap, implemented at the hardware level) to guarantee the atomicity of read-modify-write operations.
Functor
A functor is a structure-preserving mapping (or homomorphism) between 2 categories.
This means that:
- for an object A in one category, there is a corresponding object F A in the second one
- for a morphism (A → B), there is a corresponding morphism F A → F B
In Haskell, the objects are types and the mappings are functions. Type constructors (* → *) are used to map types to types.
class Functor f where
fmap :: (a -> b) -> f a -> f b
The functor defines the action of an arbitrary function (a → b) on a structure (f a) of elements of type a resulting in the same structure but full of elements of type b.
fmap id = id
fmap (g . h) = fmap g . fmap h
instance Functor ((->) r) where
fmap f g = f . g -- or fmap = (.)
Another intuition is to look at functors as producers of output that can have its type adapted. So Maybe a represents an output of type a that might be present (Just a) or absent (Nothing). fmap f allows us to adapt the output of type a to an output of type b.
Whenever you have producer of outputs, you might also have the dual consumer of inputs. This is where Contravariant comes in. The intuition behind a Contravariant is that it reflects a sort of "consumer of input" that can have the type of accepted input adapted.
class Contravariant f where
contramap :: (b -> a) -> f a -> f b
So here we can adapt the input to go from a consumer of input 'a' to a consumer of input 'b'. But to get there you need to provide a function from 'b' to 'a'.
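A minimal sketch of that intuition with a predicate type (a Predicate newtype like this also lives in Data.Functor.Contravariant):
import Data.Functor.Contravariant (Contravariant (contramap))

newtype Predicate a = Predicate { getPredicate :: a -> Bool }

instance Contravariant Predicate where
  -- adapt a consumer of 'a' into a consumer of 'b'
  contramap f (Predicate p) = Predicate (p . f)

-- a consumer of Int…
positive :: Predicate Int
positive = Predicate (> 0)

-- …adapted to consume Strings, by their length
nonEmpty :: Predicate String
nonEmpty = contramap length positive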
Isomorphisms
Category theory allows us to give a precise, abstract (works for all categories) and self-contained definition of an isomorphism:
An arrow/morphism f: A → B is called an isomorphism in C if there is an arrow g that goes from B to A such that:
g ∘ f = 1_A and f ∘ g = 1_B
Applicative
With a functor f it is not possible to apply a function wrapped in the structure f to a value wrapped in f. This is given by Applicative:
class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
<*> is just function application within a computational context.
As soon as you want to define the type (a → b → c) → f a → f b → f c, you need the applicative construction:
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2 f a b = fmap f a <*> b
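A quick check in ghci:
λ> liftA2 (+) (Just 1) (Just 2)
Just 3
λ> (+) <$> Just 1 <*> Just 2
Just 3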
It is not that hard to convince yourself that an applicative functor is just a functor that knows how to lift functions of arbitrary arities.
fmap g x = pure g <*> x
Applicative functors are to be preferred to monads when the structure of a computation is fixed a priori. That makes it possible to perform certain kinds of static analysis on applicative values.
Alternative
An Alternative instance gives an applicative functor the structure of a monoid, with empty as the unit element, and <|> as the binary operation.
class Applicative f => Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
- asum
Gives you the first successful computation or the last zero value. With failures, it really disregards them, striving for success. It is defined as:
asum = foldr (<|>) empty
→ asum [Just 1, Just 2, Nothing] -> Just 1
→ asum [Left "Failing", Right ()] -> Right ()
→ asum [Left "Failing", Left "Failing again"] -> Left "Failing again"
Note that some monads such as ExceptT append the error messages (using the Monoid m => Left m instance) when using asum or msum.
MonadPlus together with mzero, mplus and msum are the monadic equivalents. Since GHC 7.10, all MonadPlus instances are Alternative (likewise, all monads are applicatives), so you should avoid using these and prefer empty, (<|>) and asum.
Monad
class Applicative m => Monad m where
join :: m (m a) -> m a
(>>=) :: m a -> (a -> m b) -> m b (1)
1 | The signature of bind allows the second computation to depend on the value of the first one. |
Monadic values are produced in a context. Monads provide both substitution (fmap) and renormalization (join).
m >>= f = join (fmap f m)
Even if a monad is strictly more powerful than an applicative, there are situations for which an applicative is the only valid choice. Indeed, <*> lets you explore both arguments by pattern matching, but with ap the right hand side cannot be evaluated without the result from the left.
As a stretch: while applicative allows for parallelism, monad allows for sequencing.
A monad is like a monoid where we combine functors "vertically". join is analogous to (+) and return to 0.
By law, (>>) = (*>). Consequently, mapM_ = traverse_.
Classical examples of effects modelled by monads:
- Side-Effect
- Environment
- Error
- Indeterminism
Free
A free construction is a real instance of that construction that holds no extra properties; it is the least special possible instance. A free monad is just substitution (fmap) with the minimum amount of renormalization needed to pass the monad laws.
It is perfect for separating syntax (data, ast, parsing) from semantics (interpretation).
The free monad is guaranteed to be the formulation that gives you the most flexibility in how to interpret it, since it is purely syntactic.
data Free f a = Pure a | Free (f (Free f a))
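A sketch of the standard instances (assuming f is a Functor); bind substitutes at the leaves, which is exactly the "substitution with minimal renormalization" reading:
instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)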
The fixed point of a function is in general just the repeated application of that function: fix f = f (f (f (f …))), i.e. fix f = f (fix f).
A Monad n is a free Monad for f if every Monad homomorphism from n to another monad m is equivalent to a natural transformation from f to m.
Existential classes
When someone defines a universal type ∀X they’re saying: you can plug in whatever type you want, I don’t need to know anything about the type to do my job, I’ll only refer to it opaquely as X.
When someone defines an existential type ∃X they're saying: I'll use whatever type I want here; you won't know anything about the type, so you can only refer to it opaquely as X.
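A minimal Haskell sketch of the existential side: the consumer only knows there is some type with a Show instance.
{-# LANGUAGE ExistentialQuantification #-}

data Showable = forall a. Show a => MkShowable a

-- elements of different types, all hidden behind the same opaque X
heteroList :: [Showable]
heteroList = [MkShowable (1 :: Int), MkShowable "two", MkShowable True]

render :: Showable -> String
render (MkShowable a) = show a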
ByteString
- Word8 is Haskell's standard representation of a byte
- ByteString character functions (Data.ByteString.Char8) only work with ASCII text, hence the Char8 in the module name → if you are working with unicode, you should use the text package
- In general, use strict bytestrings when you have control over the message. Lazy bytestrings are a bit more flexible and used for streaming.
Laziness
Reduction is done using outermost reduction. For instance:
loop = tail loop
fst (1, loop)
-- innermost reduction gives:
-- fst (1, (tail loop))
-- fst (1, (tail (tail loop))) and never terminates
-- but outermost reduction gives:
-- fst (1, loop) = 1 and terminates
Redex
-- only one redex (2*3) both innermost and outermost
1 + (2 * 3)
-- 2 redexes :
-- (\x -> 1 + x ) (2 * 3) outermost
-- (2 * 3) innermost
(\x -> 1 + x ) (2 * 3)
Mind blowing
instance Monoid r => Monoid (Managed r) where
mempty = pure mempty
mappend = liftA2 mappend
xs = 1 : [x + 1 | x <- xs] --> [1,2,3 ...]
Right cfg -> return . Right . query cfg fp =<< F.newFileCache
UI
- HsQML (qt 5)
- SDL2/gl for games
- Web (ghcjs, threepenny, …)
Pitfall
(++) needs to reconstruct the list on the left!
# ! inefficient !
→ [1..10000] ++ [4]
Useful
-fdefer-type-errors
Fold and Traverse
Introduction
Folding is the act of reducing a structure to a single value. We can see folds as consumers or comonads.[5]
foldl/foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f v [] = v
foldr f v (x:xs) = f x (foldr f v xs)
foldl :: (s -> a -> s) -> s -> [a] -> s
foldl f v [] = v
foldl f v (x:xs) = foldl f (f v x) xs
Both functions take 3 arguments:
- a combining function 'f'
- a default value 'v'
- the data to be folded
The default value 'v' deals with the empty list []. For non-empty lists, foldl1/foldr1 are better suited.
foldr
You can think of foldr non-recursively as simultaneously replacing each (:) in a list by a given function, and [] by a given value:
foldr (-) 0 (1:2:3:[]) = (1 - (2 - (3 - 0))) = (1 - (2 - 3)) = (1 - (-1)) = 2
foldr is handy if f is not strict in both arguments. That way we can rely on laziness to stop the recursion (or build an infinite list).
map for instance has to use foldr to maintain its laziness:
map f = foldr (\x ys -> f x : ys) []
-- or map' f = foldr ((:) . f) []
-- of course it can also be defined with recursion only
map' :: (a -> b) -> [a] -> [b]
map' _ [] = []
map' f (x:xs) = f x : (map f xs)
-- ex
takeWhile (< 12) $ map (*2) [1..]
foldl
On the other hand, when the whole list needs to be traversed (sum or reverse are two examples), foldl' is actually more efficient in terms of memory.
foldl (-) 0 (1:2:3:[]) = ((0 - 1) - 2) - 3 = -6
reverse = foldl' (flip (:)) []
The strict version List.foldl' should always be used instead of the foldl from Prelude. The Foldable type class also comes with a strict foldl'.
Foldl package
To get a better representation for fold we need to transform the function into data.
{-# LANGUAGE ExistentialQuantification #-}
-- existential datatype (note that `x` does not appear on the left side)
data Fold a b
-- step func initial acc extract func (done)
= forall x . Fold (x -> a -> x) x (x -> b)
-- expressed as a GADT it would be:
data Fold a b where
  Fold :: (x -> a -> x) -> x -> (x -> b) -> Fold a b
Fold is a functor, a monoid and an applicative. It is also a profunctor and a comonad. It is actually isomorphic to a Moore machine (see https://www.fpcomplete.com/school/to-infinity-and-beyond/pick-of-the-week/part-2).
-- | Apply a strict left 'Fold' to a 'Foldable' container
fold :: Foldable f => Fold a b -> f a -> b
fold (Fold step begin done) as = F.foldr cons done as begin
where
cons a k x = k $! step x a
This makes it possible to cleanly define the function average without traversing the foldable container twice.
average = (/) <$> sum <*> genericLength
sum :: Num a => Fold a a
sum = Fold (+) 0 id
genericLength :: Num b => Fold a b
genericLength = Fold (\n _ -> n + 1) 0 id
λ> fold average [1..10000000]
Alternative monoid definition
As explained in Gabriel’s beautiful fold talk, Fold can similarly be defined as
data Fold i o = forall m . Monoid m => Fold (i -> m) (m -> o)
This approach can express parallel computation but it won’t encode stateful folds.
FoldM
data FoldM m a b =
-- | @FoldM @ @ step @ @ initial @ @ extract@
forall x . FoldM (x -> a -> m x) (m x) (x -> m b)
Fold is equivalent to FoldM Identity.
You use generalize (with no performance penalty) to get a FoldM from a Fold:
generalize :: Monad m => Fold a b -> FoldM m a b
In the turtle library, FoldM plays the role of a consumer and Shell the role of a producer; fold is how you connect them together.
Foldable/Traversable
- Fold
fold from the foldl package takes as an argument any Foldable structure. Foldables are structures that we can reduce to a single result.
class Foldable t where
fold :: Monoid m => t m -> m
foldMap :: Monoid m => (a -> m) -> t a -> m
foldMap g = mconcat . map g
λ> foldMap Sum [1,2,3,4]
Sum {getSum = 10}
fold and foldMap require the elements of the Foldable to be monoids.
In Data.Foldable, mapM_ is defined with foldr (which is kind of mind-blowing):
mapM_ :: (Foldable t, Monad m) => (a -> m b) -> t a -> m ()
mapM_ f = foldr ((>>) . f) (return ())
- Traversable
When you traverse a structure you actually want to keep it intact. The function traverse is mapM generalised to all Traversable structures and to any Applicative effect; traverse is an "effectful" fmap.
class (Functor t, Foldable t) => Traversable t where
traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
traverse f = sequenceA . fmap f
mapM = traverse
sequenceA :: Applicative f => t (f a) -> f (t a)
sequenceA = traverse id
for = flip traverse
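A quick illustration of the "effectful fmap" view, using Maybe as the applicative effect:
λ> traverse (\x -> if x > 0 then Just x else Nothing) [1,2,3]
Just [1,2,3]
λ> traverse (\x -> if x > 0 then Just x else Nothing) [1,-2,3]
Nothing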
Lenses
Usage
{-# LANGUAGE TemplateHaskell #-}
data Person = Person
{ _firstname :: String
, _surname :: String
}
-- build lenses for _firstname and _surname
makeLenses ''Person
-- create a HasPerson class with the `firstname` and `surname` optics
makeClassy ''Person
-- create HasFirstname and HasSurname classes
data Person = Person
{ _personFirstname :: String
, _personSurname :: String
}
makeFields ''Person
Lens
A lens is a first-class reference to a subpart of some data type.
Lens' s a operates on a container s and puts the focus on a.
Lens s t a b: when you replace the a in s with a b, the container type changes to t.
Note that lenses are not accessors but focusers: a lens focuses on a particular location inside a structure. These are the types we want for view, set and over/update:
view :: Lens' s a -> s -> a
set :: Lens' s a -> a -> s -> s
over :: Lens' s a -> (a -> a) -> s -> s
The big insight is that the Lens' type can be implemented as a unique type that works for all 3 methods (given we add a functor constraint). It is actually a type synonym for:
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s (1)
1 | This is a kind of a lifting from the element (a → f a) to the container (s → f s) |
Lenses form a category where . is composition and id is the identity.
> over _1 (++ "!!!") ("goal", "the crowd goes wild")
("goal!!!", "the crowd goes wild")
> ("goal", "the crowd goes wild") & _1 %~ (++ "!!!") (1)
("goal!!!", "the crowd goes wild")
> ("world", "world") & _1 .~ "hello" & _2 .~ "hello" (1)
("hello", "hello")
1 | & allows starting the expression from s and then composing. It is defined as the reverse of the $ operator. |
Common operators
^. | view |
^? | preview |
^.. | toListOf |
.~ | set |
%~ | over |
.= | set (within a state monad) |
Traverse
Traversals are lenses which focus on multiple targets simultaneously. We don't actually know how many targets a traversal focuses on: it could be exactly 1 (like a Lens), maybe 0 (like a Prism), or several. In that regard, a traversal is like a Lens', except weaker (more general):
type Traversal' a b =
forall f . (Applicative f) => (b -> f b) -> (a -> f a)
firstOf/lastOf traverse :: Traversable t => t a -> Maybe a
> firstOf traverse [1,2,3]
Just 1
> [1..8] & lastOf traverse
Just 8
- toListOf (^..)
view a list of targets
- preview (^?)
like view for Prisms or Traversals. It handles access that focuses on either 0 or 1 targets.
Prisms
Prisms are kind of like Lenses that can fail or miss.
Note how the monoid instance of String allows us to get a native String from this expression:
> s = (Left "hello", 5) > s ^. _1._Left "hello" > s ^. _1._Right ""
But without a monoid instance it cannot work and the (^?) is necessary:
> s = (Left 5, 5)
> s ^? _1._Left
Just 5
> s ^? _1._Right
Nothing
> :t preview _Right (Right 1)
Num b => Maybe b
Utils
-- create the nested Map when it is missing:
Map.empty & at "hello" . non Map.empty . at "world" ?~ "!!!"
-- > fromList [("hello",fromList [("world","!!!")])]
Monads
Reader
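The Reader monad abstracts a function that reads from a shared environment; a sketch of the standard definition:
newtype Reader r a = Reader { runReader :: r -> a }

-- ask returns the environment itself
ask :: Reader r r
ask = Reader id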
State
The State monad is just an abstraction for a function that takes a state and returns an intermediate value and some new state value:
newtype State s a = State { runState :: s -> (a, s) }
It is commonly used when state is needed in a single thread of control. It doesn't actually use mutable state and so doesn't necessarily operate in IO.
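A small usage sketch with the mtl-style helpers from Control.Monad.State:
import Control.Monad.State

-- label each element with an increasing index
label :: [a] -> [(Int, a)]
label xs = evalState (mapM step xs) 0
  where
    step x = do
      n <- get        -- read the current counter
      put (n + 1)     -- write the new state
      pure (n, x)

-- λ> label "abc" -> [(0,'a'),(1,'b'),(2,'c')]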
ST
The ST[6] monad lets you use update-in-place, but unlike IO it is escapable. This means it uses system trickery to ensure that mutable data can’t escape the monad; that is, when you run an ST computation you get a pure result.
ST actions have the form:
-- an ST action returning a value of type a in state t
newtype ST s a = ST (Store s -> (a, Store s))
-- a mutable variable in thread s
data STRef s a = STRef (MutVar# s a)
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
The reason ST is interesting is that it’s a primitive monad like IO, allowing computations to perform low-level manipulations on bytearrays and pointers. This means that ST can provide a pure interface while using low-level operations on mutable data, meaning it’s very fast. From the perspective of the program, it’s as if the ST computation runs in a separate thread with thread-local storage.
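A sketch of that escapability: a pure function implemented with in-place mutation inside runST:
import Control.Monad.ST
import Data.STRef

sumST :: Num a => [a] -> a
sumST xs = runST $ do
  ref <- newSTRef 0                          -- mutable accumulator
  mapM_ (\x -> modifySTRef' ref (+ x)) xs    -- update in place
  readSTRef ref                              -- the pure result escapes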
Exceptions
Types
- Synchronous exceptions
Generated as a result of a failing action in IO (same thread). Usually thrown using throwIO.
- Impure exceptions
Thrown in pure code by partial functions. Ideally we would not use such functions; a better practice is to return an Either in this situation.
- Asynchronous exceptions
Can occur anywhere, including in pure code. Generated when another thread or the runtime system tries to kill the current thread (via throwTo) or to report an "unrecoverable" situation like a StackOverflow.
- Interruptible actions
Some operations are interruptible by async exceptions even within a mask. This is the case for blocking functions such as takeMVar, but also for most I/O operations dealing with the outside world.
Primitives
throwIO :: Exception e => e -> IO a
try :: Exception e => IO a -> IO (Either e a)
catch :: Exception e
=> IO a -- ^ computation
-> (e -> IO a) -- ^ handler
-> IO a
finally
:: IO a -- ^ computation
-> IO b -- ^ computation to run afterward even if an exception was raised
-> IO a
a `finally` sequel =
mask $ \restore -> do
r <- restore a `onException` sequel
_ <- sequel
return r
-- | Like 'finally', but only performs the final action if there was an
-- exception raised by the computation.
onException :: IO a -> IO b -> IO a
onException io what =
io `catch` \e -> do _ <- what
throwIO (e :: SomeException)
bracket
:: IO a -- ^ acquire resource
-> (a -> IO b) -- ^ release resource
-> (a -> IO c) -- ^ use resource
-> IO c
bracket before after use =
mask $ \restore -> do
a <- before
r <- restore (use a) `onException` after a
_ <- after a
return r
Monad primitives
The exceptions package defines Control.Monad.Catch with:
- MonadThrow
class Monad m => MonadThrow m where
  throwM :: Exception e => e -> m a
- MonadCatch
class MonadThrow m => MonadCatch m where
  catch :: Exception e => m a -> (e -> m a) -> m a
- MonadMask
class MonadCatch m => MonadMask m where
  mask :: ((forall a. m a -> m a) -> m b) -> m b
  uninterruptibleMask :: ((forall a. m a -> m a) -> m b) -> m b
- Instances should ensure that, in f `finally` g, the action g is called regardless of what occurs within f, including async exceptions.
- ExceptT is not an instance of MonadMask. See MonadMask vs MonadBracket.
Shake
Example
shake avoids rebuilding when it is not necessary. To achieve this goal, it needs to know about file dependencies.
Let's take as an example the task of running a test suite. In the following example you define the dependencies in steps:
- You need to display a test report, let's call it 'build.last'
- Building build.last requires calling an external command for each node.
buildDir = "_build"
main = shakeArgs shakeOptions{shakeFiles=buildDir </> "_shake"} $ do
daemon <- liftIO $ initDaemon pref (1)
"test" ~> do (2)
content <- readFile' (buildDir <> "/build.last") (3)
putNormal content
let hasFailure = any (\i -> "x" `isPrefixOf` i) (lines content)
if hasFailure
then fail "The build has failed !"
else liftIO $ putDoc (dullgreen "All green." <> line)
buildDir <> "/build.last" %> \out -> do
Right nx <- liftIO $ runExceptT $ getNodes pdbapi QEmpty
let deps = [ buildDir <> "/" <> Text.unpack n <> ".node" | n <- nx ^.. traverse.nodeInfoName]
need deps
Stdout stdout <- quietly $ cmd ("cat"::String) deps
writeFileChanged out stdout
buildDir <> "//*.node" %> \out -> do
let node = dropDirectory1 (dropExtension out)
facts <- liftIO $ mergeFacts (pref ^. prefFactsDefault) (pref ^. prefFactsOverride) <$> F.puppetDBFacts (Text.pack node) pdbapi
r <- liftIO $ getCatalog daemon (Text.pack node) facts
deps <- liftIO $ Set.fromList .HM.keys <$> getStats (parserStats daemon)
need $ Text.unpack <$> Set.toList deps
case r of
S.Right _ ->
liftIO $ withFile out WriteMode (\h -> hPutDoc h ( dullgreen "✓" <+> text node <> line))
S.Left msg ->
liftIO $ withFile out WriteMode (\h -> hPutDoc h ( char 'x' <> space <> red (text node) <> line <> indent 2 (getError msg) <> line))
1 | each build would execute this line. TODO: is there a way to avoid this? |
2 | one of the top targets. It is a phony rule because it does not produce anything |
3 | readFile' both reads the file and registers it as a dependency |
Turtle
Streams
The Shell type represents a stream of values. You can think of Shell as [] + IO + Managed.
newtype Shell a = Shell { _foldIO :: forall r . FoldM IO a r -> IO r }
You invoke any external shell command using proc or shell. If you prefer an exception to be thrown rather than getting back the ExitCode, use procs and shells; proc(s) is more secure, but it won't do any shell string interpolation.
shell
:: Text -- Command line
-> Shell Line -- Lines of standard input to feed to program
-> io ExitCode
shells :: Text -> Shell Line -> io ()
select :: [a] -> Shell a
liftIO :: IO a -> Shell a
using :: Managed a -> Shell a
-- usual construction primitive
empty :: Shell a
-- consume the stream by printing it to stdout
view :: Show a => Shell a -> IO ()
stdout :: Shell Text -> IO ()
-- consume the (side-effect) stream, discarding any unused values
sh :: MonadIO io => Shell a -> io ()
You can simulate piping the result of a command with inshell or inproc:
inshell :: Text -> Shell Line -> Shell Line
inproc "curl" ["-s"
, "http://"
] empty (1)
& output "filename.ext" (2)
1 | keep the result of a command as a stream |
2 | pipe and copy |
When using inshell you lose the ability to inspect the exit code of the command that produces the stream.
Shell is also an instance of MonadPlus (and thus Alternative), so you can concatenate two Shell streams using <|>.
Folding
Whenever you peek into the value of a shell stream using ← you are effectively looping over all its values (as the list monad does), so code that treats the drawn value as a single result is bogus: the rest of the do-block runs once per element.
You will need to consume the stream, and one good way to do so is using fold from the foldl package:
import qualified Control.Foldl as Fold
main = do
not_found <- fold (find (prefix (text "/home/vagrant/zsh")) "/home/vagrant") Fold.null
when (not_found) $ do ...
Similarly, here is a utility function that checks if a file is empty:
isFileEmpty :: MonadIO io => FilePath -> io Bool
isFileEmpty path =
fold (input path) Fold.null
FilePath
Turtle uses the deprecated system-filepath package to handle filepaths in a more secure way[7]. Watch out, as it is at times a bit surprising (common traps ahead).
When appending filepath and text values, the best strategy is probably to keep the filepath encoding and then convert to text if necessary:
let path = "foo" </> fromText eclipseVersion </> "plugin"
    _path = format fp path
Use </> for appending filepaths, use <> for appending text.
Command line options
data Command
= Console
| Stack (Maybe StackName, StackCommand)
deriving (Show)
commandParser :: Parser Command
commandParser =
Console <$ subcommand "console" "Help msg" (pure ())
<|> Stack <$> subcommand "stack" "Help msg" stackParser (1)
1 | remaining parser (after 'stack') |
When using a group, you will need a single datatype to extract the value of the rest of the command.
Pipes
Primitives
newtype StateT s m a = StateT {
runStateT :: s -> m (a, s)
}
data Free f a = Free (f (Free f a)) | Pure a
liftF :: Functor f => f a -> Free f a
X is the uninhabited type and denotes a closed output.
Proxy
Pipes defines a single type Proxy, which is a monad transformer:
(Proxy p) => p a' a b' b m r

      Upstream | Downstream
          +---------+
          |         |
      a' <==       <== b'
          |  Proxy  |
      a  ==>   m   ==> b
          |    |    |
          +----|----+
               r
type Effect = Proxy X () () X
runEffect :: (Monad m) => Effect m r -> m r
An Effect is a proxy that never yields or awaits. The default API exposes a pull-based unidirectional flow.
Producer
A Producer is a monad transformer that extends any base monad with a yield command. yield emits a value, suspending the current Producer until the value is consumed. If nobody consumes the value (which is possible) then yield never returns.
type Producer b m r = Proxy X () () b m r

          +---------+
          |         |
    Void <==       <== ()
          |  Proxy  |
      () ==>       ==> b
          |         |
          +---------+
yield :: (Monad m) => b -> Producer' b m ()
for :: (Monad m)
=> Proxy x' x b' b m a'
-> (b -> Proxy x' x c' c m b')
-> Proxy x' x c' c m a'
-- "into" compose the bodies of `for`
(~>) :: (Monad m)
=> (a -> Producer b m r)
-> (b -> Producer c m r)
-> (a -> Producer c m r)
(f ~> g) x = for (f x) g
(~>) and yield form a Category ("Generator") where yield is the identity.
With for you consume every element of a Producer the exact same way. If this is not suitable, use next or a Consumer.
Think of next as pattern matching on the head of the Producer. It returns an Either: a Left if the Producer is done, or a Right containing the next value a along with the remainder of the Producer:
next :: Monad m => Producer a m r -> m (Either r (a, Producer a m r))
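A minimal sketch of consuming a Producer with next (standard pipes imports assumed):
import Pipes

-- collect every value and the final return value
drain :: Monad m => Producer a m r -> m ([a], r)
drain p = do
  e <- next p
  case e of
    Left r        -> return ([], r)     -- the producer is done
    Right (a, p') -> do                 -- one value plus the rest
      (as, r) <- drain p'
      return (a : as, r)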
Consumer
A Consumer represents an "exhaustible" (it may refuse to accept new values) and possibly effectful sink of values. An example of an exhaustible sink is toOutput from pipes-concurrency, which terminates when the Output it writes to has been sealed.
await blocks waiting for a new value. If nobody provides one (which is possible) then await never returns.
type Consumer a = Proxy () a () X

          +---------+
          |         |
      () <==       <== ()
          |  Proxy  |
      a  ==>       ==> Void
          |         |
          +---------+
await :: Monad m => Consumer' a m a
(>~) repeatedly feeds await in the consumer with the action passed as the first parameter. This allows consumer composition:
runEffect $ lift getLine >~ stdoutLn

 +- Feed        +- Consumer to  +- Returns new
 |  action      |  feed         |  Effect
 v              v               v
(>~) :: Effect m b -> Consumer b m c -> Effect m c
(>~) :: Consumer a m b -> Consumer b m c -> Consumer a m c
(>~) :: Producer y m b -> Pipe b y m c -> Producer y m c
(>~) :: Pipe a y m b -> Pipe b y m c -> Pipe a y m c
(>~) and await form a Category where await is the identity.
Pipe
type Pipe a b = Proxy () a () b

          +---------+
          |         |
      () <==       <== ()
          |  Proxy  |
      a  ==>       ==> b
          |         |
          +---------+
(>->) :: Monad m => Producer a m r -> Consumer a m r -> Effect m r
(>->) :: Monad m => Producer a m r -> Pipe a b m r -> Producer b m r
(>->) :: Monad m => Pipe a b m r -> Consumer b m r -> Consumer a m r
(>->) :: Monad m => Pipe a b m r -> Pipe b c m r -> Pipe a c m r
cat :: (Monad m) => Pipe a a m r
cat = forever $ do
x <- await
yield x
(>->) and cat form a Category where cat is the identity.
Bidirectional API
yield = respond
for = (//>)
(~>) = (/>/)
await = request ()
Lift
Run StateT in the base monad of the Proxy passed as second argument.
runStateP
:: (Monad m)
=> s -- state (usually of type proxy)
-> Proxy a' a b' b (S.StateT s m) r
-> Proxy a' a b' b m (r, s)
-- !! this returns a Producer a m (Maybe r, Producer a m r) !!
-- This makes sense: you are actually running the StateT monad from Producer a (StateT (Producer a m r) m) r
-- r is either Just, which means the original Producer is empty, or Nothing, which means you should go on drawing from the original Producer
-- The top producer accumulates your split; then you have a pair of a Maybe r and your original Producer
runStateP p $ do -- p will be used to feed the underlying proxy
-- entering a monad of the form: (Proxy (<- StateT monad <- Proxy))
-- All computation happens inside the underlying monad that is initially fed up by the param p
x <- lift draw -- lift the next value of the underlying proxy
case x of -- Left if the underlying proxy is empty or Right with the drawn element
Left r -> return (Just r)
Right a -> do
yield a -- push a onto the top proxy
(Just <$> input) >-> (Nothing <$ takeWhile (== a)) -- start streaming values from the underlying proxy
--
Concurrent API
You've got a mailbox!
(output, input) <- spawn Unbounded
producer >-> (consumer) output  >...>  input (producer) >-> consumer
Send to the mailbox using toOutput output (the output end accepts mail): toOutput transforms the output into a consumer.
Read from the mailbox using fromInput input (the input end receives mail): fromInput transforms the input into a producer.
newtype Input a = Input { recv :: S.STM (Maybe a) }
Pipes-Handle
Pipes-handle models the input/output stream analogy. An output stream accepts bytes (you write into it), whereas you read from an input stream. The proxy that can "read from" in the pipes ecosystem is the Consumer. By analogy, an output stream accepts output bytes and sends them to some sink, so you write into an output stream.
Pipes-Parse
A Parser is like a Consumer but with the ability to keep the leftovers:
type Parser a m r = forall x . StateT (Producer a m x) m r
draw :: (Monad m) => Parser a m (Maybe a)
runStateT :: Parser a m r -> Producer a m x -> m (r, Producer a m x)
evalStateT :: Parser a m r -> Producer a m x -> m r
execStateT :: Parser a m r -> Producer a m x -> m ( Producer a m x)
Lenses serve as transformations in both directions.
splitAt
:: Monad m
=> Int
-> Lens' (Producer a m x) (Producer a m (Producer a m x))
Connect lenses to Parsers
zoom
:: Lens' (Producer a m x) (Producer b m y)
-> Parser b m r
-> Parser a m r
Iso': don't provide them if there are error messages involved in encoding and decoding; stick to Lens'.
Pipes-Group
FreeT nests each subsequent Producer within the return value of the previous Producer so that you cannot access the next Producer until you completely drain the current Producer.
split / transform / join paradigm
-- A "splitter" such as `groupBy`, `chunksOf` or `splitOn`
Producer a m () -> FreeT (Producer a m) m () ~ [a] -> [[a]]
-- A "transformation" such as `takeFree`
FreeT (Producer a m) m () -> FreeT (Producer a m) m () ~ [[a]] -> [[a]]
-- A "joiner" such as `concat` or `intercalate`
FreeT (Producer a m) m () -> Producer a m () ~ [[a]] -> [a]
Errors management
Empty Bytestring
If you want to transform a Producer of ByteString into another Producer, for instance of csv records, be careful to be immune to empty bytestring chunks.
Managed
You have a resource a that can be acquired and then released.
-- | A @(Managed a)@ is a resource @(a)@ bracketed by acquisition and release
newtype Managed a = Managed
{ -- | Consume a managed resource
with :: forall x . (a -> IO x) -> IO x
}
Resource ((forall b. IO b -> IO b) -> IO (Allocated a))
Arrows and push based pipe
Events are discrete ← PUSH based.
Behaviors are continuous ← PULL based.
ArrowChoice corresponds to concurrency and Arrow corresponds to parallelism.
Controller/Model/View
A Controller represents concurrent effectful inputs to your system. It is really just a synonym for an Input from pipes-concurrency. So you have this function:
producer :: Buffer a -> Producer a IO () -> Managed (Controller a)
A Model is a pure streaming transformation from the combined controllers to the combined views. You can test this pure kernel by swapping out controllers with predictable inputs.
asPipe :: Pipe a b (State s) () -> Model s a b
A View handles all effectful outputs from the model.
asSink :: (a -> IO ()) -> View a
runMVC
  :: s                              -- initial state
  -> Model s a b
  -> Managed (View b, Controller a)
  -> IO s
Questions
type Producer b = Proxy Void () () b
type Producer' b m r = forall x' x . Proxy x' x () b m r
Resources
Dhall
Dhall is a programming language specialized for configuration files.
let double = \(n : Natural) -> n * 2 in double 4
{ userName = ""
, userEmail = ""
, userStacks = ["bos", "irisbox"]
, plugins = True
, mrRepoUrl = "git://github.com/CIRB/vcsh_mr_template.git"
}
{-# LANGUAGE DeriveGeneric #-}
data BoxConfig
= BoxConfig
{ _userName :: Text
, _userEmail :: Text
, _repos :: Vector Text (1)
, _eclipse :: Bool
} deriving (Generic, Show)
makeLenses ''BoxConfig
instance Interpret BoxConfig
1 | Dhall uses vector instead of list |
main :: IO ()
main = do
box_config <- Dhall.input auto "./config/box"
configure (box_config^.userName) (box_config^.userEmail)
#! /usr/bin/env bash
readarray arr <<< $(dhall <<< '(./config/box ).repos' 2> /dev/null | jq -r 'join (" ")')
for s in ${arr}; do
echo "$s"
done
Naming convention
Algebraic Data Type
Quite common (used in pipes, ekmett, servant, tibbe):
data List a
= Cons a (List a)
| Nil
Also used for simple sum or product declarations:
data MySum = A | B
data MyProduct = MyProduct Int String
Record
The most common (used in lens, fpcomplete, servant, tibbe, hindent):
data Person = Person
{ _firstName :: String -- ^ First name
, _lastName :: String -- ^ Last name
} deriving (Eq, Show)
In order to mimic ADTs and to play well with haskell-indentation we could go with this instead (but it is less common!):
data Person
= Person
{ _firstName :: String
, _lastName :: String
} deriving (Eq, Show)
Module
module Puppet.Parser (
expression
, puppetParser
, runPParser
) where
Idioms
Maybe
Using a case to pattern match a maybe value is quite common:
readline >>= \case
Just "Y" -> pure ()
_ -> die "Abort"
You might want to define an unwrapWith utility mimicking rust unwrap_with, but it would be limited and impractical:
-- | Unwrap a maybe value in an io computation
-- passing an alert action in case of Nothing
unwrapWith :: MonadIO io => io a -> Maybe a -> io a (1)
unwrapWith io_alert v = maybe io_alert pure v
1 | Note how `a` fixes the input/output |
At the end of the day it is better to stick with the 'case pattern-matching' idiom, even for simple cases, and avoid the less readable maybe variant:
readline >>= \case
Nothing -> die "Abort"
Just v -> pure v
readline >>= maybe (die "Abort") pure (1)
1 | shorter but arguably more cryptic |
Extensions
- Syntax
LambdaCase, GADTSyntax, RankNTypes, BangPatterns, RecordWildCards, DuplicateRecordFields, MultiWayIf
- Common
ScopedTypeVariables (69), DeriveGeneric (66), TupleSections (59), MultiParamTypeClasses (56), FlexibleInstances (33), FlexibleContexts (31), TypeOperators (29), FunctionalDependencies (15)
- Unknown
MonadComprehensions (12), EmptyCase (3), DisambiguateRecordFields (14), RecursiveDo (10), ParallelListComp, TypeFamilies, PatternSynonyms, PartialTypeSignatures, TypeApplications
Common functions
-- give a default and always get an a from a maybe value
fromMaybe :: a -> Maybe a -> a
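For example:
λ> fromMaybe 0 (Just 3)
3
λ> fromMaybe 0 Nothing
0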
GADTs
GADTs let you associate different type parameters with different data constructors of a type.
For example, imagine we represent simple language terms that can only be bool/int literals and And/Add operations between those:
data Expr = ExprInt Int
| ExprBool Bool
| ExprAnd Expr Expr
| ExprAdd Expr Expr
This would let us do invalid things like:
ExprAnd (ExprInt 5) (ExprBool True)
Firstly and less importantly, GADTs let us write the same definition using a different notation:
data Expr where
ExprInt :: Int -> Expr
ExprBool :: Bool -> Expr
ExprAnd :: Expr -> Expr -> Expr
ExprAdd :: Expr -> Expr -> Expr
The real point of this notation is that it is an opportunity to associate different constructors of Expr with different type constraints and type parameters:
So you restrict the return type of each constructor:
data Expr :: * -> * where
ExprInt :: Int -> Expr Int
ExprBool :: Bool -> Expr Bool
ExprAnd :: Expr Bool -> Expr Bool -> Expr Bool
ExprAdd :: Expr Int -> Expr Int -> Expr Int
This rules out non-sensical terms like:
ExprAnd (ExprInt 5) (ExprBool True)
Additionally, GADTs let you add type-class constraints and forall’d variables to each of the constructors. For example, let’s say we want to represent a length-indexed list:
data LenList :: Nat -> * -> * where
Nil :: LenList 0 a
Cons :: a -> LenList n a -> LenList (1 + n) a
Note that not only do the 2 differing constructors have differing type params (0/1+n), they also have constraints linking the "n" from the "LenList" type index (aka type parameter) to the "n" of the given list.
Another important facet of GADTs is that all this extra information is not just used to type-check value constructions as shown above. It also gives you back type information when you do case analysis. i.e:
case myLenList of
  Nil       -> ... -- the type of myLenList in this branch is inferred to be (LenList 0 a)
  Cons x xs -> ... -- the type of myLenList in this branch is inferred to be (LenList (1 + n) a), and the type of xs is inferred to be (LenList n a)
To reiterate, the type of the term we're case analyzing is inferred differently according to runtime values (which constructor is chosen).
Lastly, by allowing interesting types and constraints on each constructor, GADTs implicitly allow existential quantification, and storing of type-class instances inside values.
For example, this existentially quantified (and mostly useless) type:
data SomeShowable = forall a. Show a => MkSomeShowable a
Can be represented with GADTs as:
data SomeShowable where MkSomeShowable :: Show a => a -> SomeShowable
Note that the forall a. can be left implicit in the GADT version.
Interestingly, with GADTs, you can have existential quantification only in some of your constructors. You can have differing type-class instances stored inside different constructors. When you pattern-match your GADT constructor, the instance implicitly comes into scope.
Operational
Think of monads as sequences of primitive instructions.
Operator colloquial name
>>= | bind |
>> | then |
*> | then |
-> | to | a -> b: a to b
<- | bind | (as it desugars to >>=)
<$> | (f)map |
<$ | map-replace by | 0 <$ f: "f map-replace by 0"
<*> | ap(ply) |
$ | apply to / of |
. | after | a . b $ c: "a after b applied to c"
!! | index |
! | index / strict | a ! b: "a index b"; foo !x: "foo strict x"
<|> | or | expr <|> term: "expr or term"
++ | append |
[] | empty list |
: | cons |
:: | of type | f x :: Int: "f x of type Int"
\ | lambda |
@ | as | go ll@(l:ls): "go ll as l cons ls"
~ | lazy | go ~(a,b): "go lazy pair a, b"
>=> | fish |
<=< | left fish |
Developments
hasktags -e src
Blast
Two main ideas:
- build two 'isomorphic' ASTs for the slave and the master. The shape (nodes) is the same, but each node is slightly different on the master and the slaves
- find a way so that the execution of f x can be done in a type-safe way on the slaves using the cache
- you know from the start how many slaves you have
- Kubernetes
Each pod in Kubernetes has a unique IP address → one slave == one pod.
If a slave fails, you get another one inside the pod → same IP.
Spacemacs
Cheatsheet
Magit
<SPC> g e | Ediff a file (awesome) |
<SPC> g d r | Revert hunk |
Helm
c-k | "abort" helm completion !! |
Ido
c-Return | dired |
c-o | open in a new buffer |
c-s | open in a new buffer |
Search
<SPC> s l | helm-semantic-or-imenu |
<SPC> s w g | search with google |
<SPC> s f | search within a path |
<SPC> s s | helm-swoop |
Misc
<SPC> v | expand region mode |
<SPC> a u | undo-tree-visualize |
<SPC> p f | helm-find file for your project !amazing! |
<SPC> f e h | open helm spacemacs for help |
<SPC> f S | save all buffers |
<SPC> f y | show file name |
<SPC> i K | insert empty line above |
<SPC> b b | helm find buffer |
<SPC> x J / <SPC> x K | move lines up and down |
<SPC> r y | kill ring |
,gg | jump to def (awesome!) |
<SPC> p y | find tags |
<SPC> / | ag search |
<SPC> m c R | reload .spacemacs |
<SPC> f e d | open .spacemacs |
<SPC> n r | narrow region |
(define-key evil-normal-state-map (kbd "C-p C-b") 'ibuffer)
Surround
Enter visual state with v, then e to select the expression, then s to surround, then ) to surround with parens without extra spaces.
Tramp
/ssh:saltmaster_testing|sudo:root@saltmaster:/srv/myfile.sls
Replace on multiple files
Docker
Intro
Containers work by isolating the differences between applications inside the container so that everything outside the container can be standardized.
At the core of container technology are cgroups
and namespaces
. Control groups allow the host to share and limit the resources each process or container can consume. Namespaces limit what a process can see: a process can only see the process IDs that live in its own namespace.
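You can see which namespaces a process belongs to under /proc; two processes share a namespace exactly when these ids match. A small Python illustration (Linux only, not Docker-specific):
import os

# each entry in /proc/<pid>/ns is a symlink whose target names the
# namespace kind and its inode id, e.g. "pid:[4026531836]"
for kind in ("pid", "net", "mnt", "uts"):
    print(kind, os.readlink(f"/proc/{os.getpid()}/ns/{kind}"))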
A Docker environment is made up of filesystems layered over each other. At the base is a boot filesystem, docker’s next layer is the root filesystem, rootfs. Then Docker takes advantage of a union mount to add more readonly filesystems on top. These filesystems are called "images". Finally, Docker mounts a read-write filesystem on top of any layers below. This is where whatever process we want our Docker container to run will execute.
User images are named using "initial/name:tag"
The RUN
instruction in a Dockerfile
executes commands on the current image and commits the results.
Useful commands
docker build -t initial/name .
docker commit containerid imagename
docker ps
docker images
docker run -i -t initial/name /bin/bash
docker run -d --net compose_default puppet/puppet-agent-centos (1) (2)
docker exec (3)
1 | -d for detached (will run in the background) |
2 | --net compose_default specify the network (this one is created by default by docker-compose) |
3 | exec runs a command inside an already running container |
Link
Links are used to enable secure communication between two containers. The first container is oddly enough called the child. This is odd because it is usually a server and has to be started first… The first container exposes a port and is labelled with a name.
# Child or first container
sudo docker run -i -t -h puppet -name puppetmaster pra/pmaster /bin/bash
# Parent or second container have all info to connect to the first
sudo docker run -i -t -h minion -name minion -link puppetmaster:puppet pra/minion /bin/bash
SSH-tunnel
ssh -q -M -S my-ctrl-socket -fnNT -L 27017:localhost:27017 alhazen@pulp.irisnet.be
# to use the host network: --net host
docker run --net host -e PULP_LOGIN=$(PULP_LOGIN) -e PULP_PWD=$(PULP_PWD) --rm -v $(PWD):/code -ti test /code/bin/clean.py $(ENV) --repo-name=$(REPO_ID)
ssh -q -S my-ctrl-socket -O exit alhazen@pulp.irisnet.be 2> /dev/null
Export/Import
Export acts on containers! It currently does not work from containers to images… It is really brittle right now (just wait for 1.0)
In the meantime it is possible to use any image as your base image in the Dockerfile…
Mount
You cannot mount a host dir with the VOLUME instruction inside the Dockerfile. You need to pass it at runtime :
# !! first -v, then -t !!
docker run -it -v /media/puppet-stack-middleware:/etc/puppet/environments/middleware_local:ro pra/puppetmaster /bin/bash
Initial Win7 host setup
Win7 hosts a docker ubuntu VM (standard install) using vagrant.
Change the Vagrantfile to mount the shared `puppet-stack-middleware`directory:
config.vm.share_folder "puppet-stack-middleware", "/media/puppet-stack-middleware", "C:/Users/pradermecker/VirtualBox VMs/shared/puppet-stack-middleware"
Connect to the docker VM from an arch VM with:
ssh -p 2222 vagrant@10.0.2.2
Create a dir puppetmaster
and a file inside called Dockerfile
. Build with sudo docker build .
Then you need to ssh-copy-id your public id_rsa.pub key to be able to fetch the Docker configuration from Github.
Troubleshooting
- WARNING
-
In centos 6.4, usePAM needs to be set to no, while it needs to be set to yes in 6.5
- WARNING
-
The latest official CentOS image, currently 6.5, comes with a broken centos.plus version of libselinux. To remove it you need to:
yum downgrade --skip-broken libselinux libselinux-utils
Docker compose
Swarm node
Each node is configured by puppet and contains:
-
a swarm container running inside docker (spawned by the docker engine daemon)
-
a registrator container running inside docker (spawned by the docker engine daemon)
-
a consul agent (doesn't run within docker)
DNS
You can use Consul
as a DNS service. dnsmasq
is configured within each swarm node while every docker container inside a node runs with --dns 172.17.0.1
.[8]
Salt
Targeting
Minion id
-
unique (FQDN by default)
-
can be overridden in the minion config file
-
if changed, P/P keys need to be regenerated
-
match by shell-style globbing around the minion id or top file
-
use single quotes
-
Perl-compatible regex can be used with the -E option
salt '*.be.brussels' test.ping
salt -L 'web1,web2,web3' disk.usage
salt -E 'web[0-9]' cmd.exec_code python 'import sys; print sys.version'
base:
'web-(devel|staging)'
- match: pcre
- webserver
Grains
-
static bits of information that a minion collects when the minion starts
-
can be statically described in the minion config file with the option grains
-
available to Salt modules
-
automatically synced when state.highstate is called.
salt -G 'os:CentOS' --batch-size 25% grains.item num_cpus
Node groups
-
predefined group of minions declared in the master
-
declared using compound matchers (see doc)
Salt states
Use SLS files (SaLt State) to represent the state of a system.
-
SLS files are just dictionaries, lists, strings, and numbers (HighState data structure)
-
default serialization format is YAML with the Jinja2 templating system
-
system data and functions can be used via salt, grains and pillar
-
files are combined to form a salt state tree using source, include and extend
declaration-id: (1)
pkg:
- installed
service:
- running
- watch: (2)
- pkg: apache
- file: /etc/httpd/conf/httpd.conf
/etc/httpd/conf/httpd.conf:
file.managed:
- source: salt://apache/httpd.conf
- user: root
- group: root
- mode: 644
1 | declaration-id sets the name of the thing that needs to be manipulated |
2 | watch & require to manage order and events |
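A rendered SLS file really is just data; one way to convince yourself is to parse the YAML directly. A minimal Python sketch (the PyYAML dependency is an assumption):
import yaml

sls = """
declaration-id:
  pkg:
    - installed
  service:
    - running
    - watch:
      - pkg: apache
"""
# SLS files are just dictionaries, lists, strings and numbers once parsed
print(yaml.safe_load(sls))
# {'declaration-id': {'pkg': ['installed'],
#   'service': ['running', {'watch': [{'pkg': 'apache'}]}]}}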
# given a sls web/apache.sls
salt '*' state.sls web.apache
Salt file server & top file & environment
The top file is used to map what modules get loaded onto what minions
base: (1)
'bmob': (2)
- packages (3)
1 | environment |
2 | target for state.highstate |
3 | sls module name |
The file server is suitable for distributing files to minions
file_roots:
base:
- /srv/salt
External Auth
# The external auth system
external_auth:
ldap:
GP_APP_JENKINS%:
- 'test.*'
- 'grains.*'
- 'pillar.*'
pradermecker:
- 'G@hostname:middleware': (1)
- '.*'
- '@runner' (2)
- '@wheel'
- '@jobs'
jfroche:
- 'saltutil.*'
- '@runner'
- '@wheel'
- '@jobs'
auth.ldap.basedn: OU=ACCOUNTS,OU=CIRB-CIBG,DC=ad,DC=cirb,DC=lan
auth.ldap.binddn: CN=<%= @ldap_name %>,OU=Saltmasters,OU=Apps,OU=Service_Groups_Accounts,OU=ACCOUNTS,OU=CIRB-CIBG,DC=ad,DC=cirb,DC=lan
auth.ldap.bindpw: <%= @ldap_pwd %>
auth.ldap.filter: (sAMAccountName={{username}})
auth.ldap.port: 389
auth.ldap.server: svidscavw003.prd.srv.cirb.lan
auth.ldap.tls: False
auth.ldap.no_verify: True
auth.ldap.activedirectory: True
auth.ldap.groupclass: group
auth.ldap.accountattributename: sAMAccountName
auth.ldap.persontype: person
1 | Define the allowed targets (compound). No relation to the salt notion of environment. |
2 | Access to the runner module, but this works only via the salt-api.
On the command line, salt-run does not support the pam or ldap flag. |
Standalone minions
Minion can run without master.
In the minion config file, set the option file_client: local
By default the contents of the master configuration file are loaded into pillar for all minions; this enables the master configuration file to be used for global configuration of minions. To prevent the master config from being added to the pillar, set pillar_opts to False.
Master Event
import salt.utils.event  # runs on the master, inside salt's python environment
event = salt.utils.event.MasterEvent('/home/vagrant/projects/jules/var/run/salt/master')
event.get_event(wait=20, tag='salt')
Pillars
The data can be arbitrary. The pillar is built in a similar fashion as the state tree, it is comprised of sls files and has a top file, just like the state tree. The default location for the pillar is in /srv/pillar ("pillar_roots" master config key).
GITFS
When using the gitfs backend, Salt translates git branches and tags into environments, making environment management very simple.
fileserver_backend:
- git
gitfs_remotes:
- http://stash.cirb.lan/scm/middleware/salt-stack.git
Salt API
curl -si salt.sta.srv.cirb.lan:8000/login \
-H "Accept: application/json" \
-d username='jfroche' \
-d password='xMLrzzzz' \
-d eauth='pam' > /tmp/cookies.txt
curl -b /tmp/cookies.txt -si salt.sta.srv.cirb.lan:8000 \
-d client='runner' \
-d mods='orchestration.bootstrap-puppet' \
-d fun='state.orchestrate' \
-d eauth='pam'
curl -ssik https://salt.sta.srv.cirb.lan:8000/run \
-H 'content-type: application/json' -H 'Accept: application/x-yaml' -d '[{
"username": "xxx",
"password": "xxxxxx",
"eauth": "ldap",
"client": "runner",
"fun": "doc.execution"
}]'
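The login-then-call dance scripts easily; a rough Python sketch with requests mirroring the curl examples above (the token field layout in the login response is an assumption about this salt-api setup):
import requests

BASE = "http://salt.sta.srv.cirb.lan:8000"

# login and fetch a session token
login = requests.post(BASE + "/login",
                      headers={"Accept": "application/json"},
                      data={"username": "jfroche",
                            "password": "xxx",
                            "eauth": "pam"}).json()
token = login["return"][0]["token"]

# same runner call as the curl example, authenticated by token
resp = requests.post(BASE,
                     headers={"Accept": "application/json",
                              "X-Auth-Token": token},
                     data={"client": "runner",
                           "fun": "state.orchestrate",
                           "mods": "orchestration.bootstrap-puppet"})
print(resp.json())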
Orchestration
[main]
SALTAPI_URL=http://saltmaster.sandbox.srv.cirb.lan:8000
SALTAPI_USER=pradermecker
SALTAPI_PASS=pass
SALTAPI_EAUTH=pam
salt-run state.orchestrate orch.test saltenv=middleware (1)
pepper '*' test.ping
pepper 'puppetmaster2*' grains.item subgroup role
pepper --client=runner state.orchestrate mods=orchestration.bootstrap-puppet
1 | pick up the gitfs branch that hosts the orch.test source |
set_puppet_role_to_master:
salt.function:
- name: utils.set_role
- tgt: 'G@role:server and G@subgroup:puppet'
- kwarg:
role: master
- require:
- salt: run_saltmaster
# /srv/salt/orch/test-puppet.sls
run_puppet_jenkinsmaster:
salt.state: (3)
- sls:
- puppet (4)
- tgt: 'G@role:master and G@subgroup:jenkins'
- tgt_type: compound
ping_saltmaster:
salt.function: (1)
- name: test.ping
- tgt: 'role:saltmaster'
- tgt_type: grain
- require: (2)
- salt: run_puppet_jenkinsmaster
# /srv/salt/puppet.sls:
puppet:
module.run:
- name: cmd.run
- arg:
- 'puppet agent --verbose --onetime --no-daemonize --color false'
1 | To execute a function, use salt.function |
2 | Force order |
3 | To apply state files, use salt.state |
4 | Apply the state file /srv/salt/puppet.sls |
Salt SSH
make salt-ssh HOST=jenkins2 ZONE=prod CMD="state.sls utils.migrate_puppet3"
Useful commands
salt '*' saltutil.sync_all
pep 'svappcavl704.dev.srv.cirb.lan' cmd.run "cat /etc/salt/master" | jq '.return[]' | jq -r '.[]'
pep 'svappcsvl028.prd.srv.cirb.lan' cmd.run "cat /etc/salt/master" | jq '.return[]' | jq -r '.[]'
Postgrest
http://pgserver.sandbox.srv.cirb.lan:3000/jids?jid=eq.20150831150415858891
http://pgserver.sandbox.srv.cirb.lan:3000/salt_returns?full_ret->>jid=eq.20150831150437889173
Install PRD / Bootstrap
## get salt/puppet version we want
## We do need to update puppet because the current salt config does not work with < 3.8
yum versionlock delete 0:*
yum install salt-master salt-minion puppet
# temp /etc/hosts to point to the new salt master
systemctl start salt-master
systemctl start salt-minion
salt '*' saltutil.sync_all
## we need to manually change the config of /etc/salt/master:
#
# file_roots:
# base:
# - /srv/salt/
# middleware:
# - /srv/salt/middleware
## new puppetmaster, foreman, puppetdb, pgserver
# temp /etc/hosts to point to the new salt master
# we still need to manually
yum makecache fast
yum update -y
yum clean all
# we still need to manually
mkdir -p /etc/facter/facts.d/
vim /etc/facter/facts.d/host-info.txt
# and finally we need piera to get hiera data before we can bootstrap ...
## Do test every pings are working correctly
salt-run state.orchestrate orch.ping saltenv=middleware
## There are issues when puppetconfig restarts the minion during the orchestration process
## Let's do it manually
salt -C 'G@role:master and G@subgroup:puppet and G@hostgroup:middleware' puppetutils.run_apply hostgroup=middleware role=server zone=prod subgroup=puppet
salt -C 'G@role:saltmaster and G@hostgroup:middleware and G@zone:prod' puppetutils.install_stackrpm hostgroup=middleware zone=prod
salt -C 'G@role:saltmaster and G@hostgroup:middleware and G@zone:prod' puppetutils.run_apply hostgroup=middleware role=saltmaster zone=prod
salt -C 'G@role:pgserver and G@hostgroup:middleware and G@zone:prod' puppetutils.run_agent hostgroup=middleware zone=prod
Issues
-
When the master restarts, windows minions do not seem to be able to reconnect (without a minion restart)
Git
Internals
Git maintains snapshots of a directory's contents. It is a content-addressable filesystem: a simple key-value data store. Keys are SHA-1 hashes and values are objects.
There are 4 different types of objects:
-
Blob stores files (it does not store the name of the file)
-
Tree references other trees and/or blobs, stores the file name and groups them together (as directories do)
-
Commit points to a single tree and realizes "snapshots".
-
Tag marks a specific commit
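For instance, the key of a blob is the SHA-1 of a small header plus the file content; a quick Python sketch reproducing what git hash-object computes:
import hashlib

def git_blob_sha1(content: bytes) -> str:
    # git hashes "blob <size>\0" followed by the raw content
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest()

# same value as: echo 'hello' | git hash-object --stdin
print(git_blob_sha1(b"hello\n"))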
Tips
- Ignore files in all projects but keep this for yourself
-
-
Add to your ~/.gitconfig file
```
[core]
excludesfile = /home/username/.gitignore
```
-
Create a ~/.gitignore file with file patterns to be ignored
-
- Delete a range of tags both locally and remotely
-
for i in `seq 248 638`; do git tag -d $i; git push --delete origin $i;done
Linux
Script
#!/bin/bash -xe
set -euo pipefail (1)
while read host
do
cat plone_team.pub | ssh alhazen@"$host".prd.srv.cirb.lan "grep plone ~/.ssh/authorized_keys || cat >> ~/.ssh/authorized_keys"
done < plone-prod-uniq.txt
1 | exit immediately when a command fails (e), even within a pipe (o pipefail); treat unset variables as an error (u) |
pandoc --latex-engine=xelatex -o blog.pdf http://blog.jakubarnold.cz/2014/07/22/building-monad-transformers-part-1.html
LVM
-
change disk size on the VCloud
-
create a new partition with fdisk (e.g. sdb1) so we don't change anything on the existing partition table
-
add this new partition as a new physical volume:
pvcreate /dev/sdb1
-
vgextend system_vg /dev/sdb1
-
lvextend -L+12G /dev/system_vg/data
-
xfs_growfs /dev/system_vg/data
or by adding a new disk using puppet :
-
add a new disk on the VCloud
-
after a short delay, VCloud will automatically create a new partition, for instance '/dev/sdd'
-
add this new partition as a new physical volume:
pvcreate /dev/sdd
. You can see it with pvs
-
vgextend vg_rhel /dev/sdd
(the name 'vg_rhel' is fixed for our new RHEL 7 template) -
puppet agent -t
will now create a new lv nix
. You can see it with lvs
At the CIRB the easiest way is:
-
to ask for a machine with 40G (second disk usually /dev/sdb)
-
The machine will be received with a full
vg_rhel
of 40G. Go to the vcloud console and extend the second disk to 60G -
The machine now has a /dev/sdb disk with 60G. Extend the pv using
pvresize -v /dev/sdb
. And check with vgdisplay or pvs
.
Tips
- Add route in windows
-
route ADD 192.168.30.0 MASK 255.255.255.0 10.255.10.4
- SCP copy from local to remote
-
scp -i ~/.ssh/user_rsa -r folder user@svifscapl003.prd.srv.cirb.lan:/tmp
- SCP copy from remote to remote
-
Using your local computer
ssh-add ~/.ssh/alhazen_rsa
# Give alhazen the permission to write on targetfqdn:/srv/tmp
ssh -A -i ~/.ssh/alhazen_rsa alhazen@sourcefqdn \
  "scp -o StrictHostKeyChecking=No /srv/data/pgserver.dump alhazen@targetfqdn:/srv/tmp"
- SSH with password for a specific host
-
~/.ssh/config
Host 192.168.xx.xx
PreferredAuthentications password
- NsLookup
$ nslookup.exe stash.cirb.lan 192.168.34.2xx (1)
Non-authoritative answer:
Server: svidscapw000.ad.cirb.lan
Address: 192.168.34.2xx
Name: stash.cirb.lan
Address: 192.168.34.xx
1 | DNS to lookup + DNS server |
Definitions
- Push (SSE) vs Pull (REQ/REP)
- Application layer
-
HTTP, SNMP, AMQP, XMPP, IRC, DHCP, WebDAV, SSH, FTP, SIP, Telnet
- Transport layer
-
TCP, UDP (SCTP)
Logs
journalctl -r - show logs in reverse order
journalctl -b - show logs since last boot
journalctl -k - show kernel logs
journalctl -p warning - show logs with warning priority
journalctl -p err - show logs with error priority
journalctl --since=2016-08-01 - show logs since
journalctl --until=2016-08-03 - show logs until
journalctl --until=today - show logs until midnight today
journalctl --since=yesterday - show logs since yesterday midnight
journalctl --since=-2week - show logs for last 2 weeks
journalctl -u <unit-name> - show logs of certain unit
journalctl /dev/sda - show kernel message of device
journalctl -o json - show logs in json format
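-o json emits one JSON object per line, handy for post-processing; a small Python sketch (the journal field names used are standard):
import json
import subprocess

out = subprocess.run(
    ["journalctl", "-p", "err", "-o", "json", "--since", "yesterday"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.splitlines():
    entry = json.loads(line)
    print(entry.get("_SYSTEMD_UNIT", "?"), entry.get("MESSAGE"))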
Postgres
General
Glossary
- PEM
-
Postgres Enterprise Manager
- PPAS
-
Postgres Plus Advanced Server
- WAL
-
At all times, PostgreSQL maintains a WAL (Write Ahead Log) in the pg_xlog/ subdirectory of the cluster’s data directory. The log describes every change made to the database’s data files. This log exists primarily for crash-safety purposes: if the system crashes, the database can be restored to consistency by "replaying" the log entries made since the last checkpoint. However, the existence of the log makes it possible to use a third strategy for backing up databases: we can combine a file-system-level backup with backup of the WAL files. If recovery is needed, we restore the backup and then replay from the backed-up WAL files to bring the backup up to current time.
Architecture
One process per user, NO THREADS! Processes are managed by the Postmaster, which acts as a listener for new connections and as a supervisor to restart them.
The term "buffer" is usually used for blocks in memory.
7500 concurrent users → use connection pooling
16MB (the default WAL segment size)
Cluster
A cluster is a collection of databases. Clusters have separate:
-
data directory
-
TCP port
-
set of processes
To create a cluster, execute the following command with the postgres user (! not root !): [postgres]$ initdb --locale en_US.UTF-8 -E UTF8 -D '/var/lib/postgres/data'
To create a second cluster on the same machine you need to:
-
as root, create a DATA directory
-
as root, change the owner of the DATA directory to enterprisedb or postgres (depending on the version of postgres, enterprise or community)
-
as postgres (or enterprisedb), do: initdb -D '/var/lib/postgres/data'
There is a tricky behavior with the second cluster when you want to connect with a client. By default, connections will be refused for user "enterprisedb"… You need to change the pg_hba file and set "trust" for enterprisedb… Then set a password with the client and put it back to md5.
host all enterprisedb 192.168.104.0/24 trust
NixOS
services.postgresql = {
enable = true;
authentication = ''
local saltstack all trust
'';
};
CREATE USER vagrant SUPERUSER LOGIN; (1)
CREATE USER salt LOGIN; (2)
CREATE DATABASE saltstack WITH owner = 'salt';
ALTER USER salt WITH password 'saltpass';
psql saltstack -U salt (3)
1 | as root |
2 | as vagrant |
3 | as vagrant, check that you can connect to the db |
Tips
- Assign the result of a multi-record select to a variable in plpgsql
CREATE OR REPLACE FUNCTION notify_result() RETURNS TRIGGER AS $$
DECLARE
notification jsonb;
chan text;
BEGIN
-- Get the user as the name of the channel
SELECT load->>'user' into chan from jids where jid = NEW.jid;
-- This is not working because the salt_returns table hasn't been filled in yet ...
notification := (SELECT array_to_json(array_agg(row_to_json(t))) from (SELECT r.full_ret FROM salt_returns r where r.jid = NEW.jid) t);
-- Execute pg_notify(channel, notification)
PERFORM pg_notify(chan, NEW.jid);
-- Result is ignored since this is an AFTER trigger
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS notify_result on jids;
CREATE TRIGGER notify_result AFTER INSERT ON jids
FOR EACH ROW EXECUTE PROCEDURE notify_result();
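On the client side a listener can pick these notifications up; a minimal sketch with psycopg2 (DSN and channel name are placeholders):
import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=salt user=salt")  # placeholder DSN
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute('LISTEN "pradermecker";')  # channel = the user name set by the trigger

while True:
    # block until the connection socket is readable, then drain notifications
    if select.select([conn], [], [], 60) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        print(notify.channel, notify.payload)  # payload is the jid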
- COPY
COPY edbstore.categories TO '/tmp/test.csv' WITH (FORMAT 'csv');
- Backup & Recovery
-
for a small DB, we can use a sql dump several times a day:
pg_dump dbname | gzip > filename.gz
pg_dump dbname | split -b 1m - filename
psql dbname < infile
- Shell
psql -h 192.168.14.62 -W -U pradermecker postgres < create_PGPUPPETDB.sql
export PGPASSWORD=dbpasswordforpuppetdb
ssh puppetmaster-prod 'sudo -u postgres pg_dump puppetdb ' | psql -h 192.168.14.62 -U puppetdb -w PGPUPPETDB
for t in $(psql -U enterprisedb -d edbstore -t -c "select tablename from pg_tables where tableowner='edbstore'"); do
pg_dump -t edbstore.$t -U enterprisedb edbstore > $t.sql;
done
select tablename from pg_tables where tableowner='edbstore';
select table_name from information_schema.tables where table_schema='edbstore';
Replication
-
Hot Streaming Replication (Warm Streaming Replication or Log WAL Shipping is deprecated). There is a daemon process started by the PostMaster
We don’t have to start the slave before the master. The slave can just wait for a master to start up.
-
First shutdown the master and set it up for replication by:
-
change
postgres.conf
wal_level = hot_standby
max_wal_senders = 4
wal_keep_segments = 32
archive_mode = on
archive_command = 'cp %p /data/archive/%f'
-
change
pg_hba.conf
: host replication repuser slaveip/32 md5
-
-
Configure the
pg_hba.conf
of the slave: host replication repuser masterip/32 md5
-
Initialize the cluster
On a local server, you can just copy the data
folder from the master to the slave or pg_basebackup -h localhost -D /opt/PostgresPlus/9.3AS/data1
but on a real set up you would follow these steps:
-
on the master:
postgres=# select pg_start_backup('cluster_init');
-
on the slave:
rsync -avz --delete --inplace --exclude-from=/srv/pgsql/do-not-sync root@195.244.165.68:/srv/pgsql/data/ /srv/pgsql/data (1)
1 | with the postgres user |
-
on the master
postgres=# select pg_stop_backup();
" PAX process
Select * from pg_stat_activity select * from pg
Programming Notes
Notion
- Functional Programming
-
The meaning of the programs is centered around evaluating expressions rather than executing instructions.
This is the key to functional programming’s power — it allows improved modularization
In a functional program, what matters is that it is a value-oriented language; what we are building are sentences made from different values and higher-order functions. The types and higher-order values define the grammar of those sentences.
- Algebraic Data Type
-
A struct or new type, composed from other types either as a product or a sum type.
Name | Member | Inhabitant
---|---|---
Void | | 0
Unit | () | 1
Bool | True, False | 2
Going from there you can define by sum a type with 3 elements:
data Add a b = AddL a | AddR b
-- or
data Either a b = Left a | Right b
-- if a is Bool and b is () you have got:
addValues = [AddL False, AddL True, AddR ()]
You can also use a product type with Mul:
data Mul a b = Mul a b
-- or
data (,) a b = (a, b)
mulValues = [Mul False False, Mul False True, Mul True False, Mul True True]
- Abstract Data Type (ADT)
-
A data type is said to be abstract when its implementation is hidden from the client. ADTs are types which encapsulate a set of operations. The concept originates from CLU (Liskov, 1972):
Modules → Partitions → ADTs
The canonical example is a Stack
for which we define a set of operations including a way to construct/get a new empty Stack.
This is very different from and even dual to the concept of objects in OO. ADT operations belong to their datatype, whereas OO objects are implemented as collections of observations (methods) that can be performed upon them. The focus on observations, rather than construction, means that objects are best understood as co-algebras.
- Hash Value
-
Hashing
is a transformation AnyText → TextWithFixedSmallerSize
(an array of bytes) called a digest
or hash value
, with the following (ideal) properties:
-
it is quick to compute
-
it is not reversible: you cannot get AnyText back from the digest
-
the digest
is unique, so that two different AnyText
values will always have a different digest.
The idea is to store this mapping in a database so that you use the digest
as a representation for AnyText
(the digest
becomes the id/handle for the text).
Given such a mapping you can also hash AnyText
, get a digest
and do a lookup in the table to see if the mapping already exists.
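In Python this is one call; the same input gives the same fixed-size digest, and the digest cannot be turned back into the text:
import hashlib

digest = hashlib.sha256(b"AnyText").hexdigest()
print(digest)  # 64 hex chars, whatever the input size
# deterministic: hashing the same text again yields the same id/handle
assert hashlib.sha256(b"AnyText").hexdigest() == digest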
- CAP
-
-
Consistency
-
Availability
-
Partition tolerance
-
- Glossary
-
nibble
Half of one byte, so 4 bits/digits → 16 values
subroutine
Synonym for function
subtype
Circle <: Shape
Object-Oriented Programming
Objects by definition include mutable state → intensional object identity !!
Quotes
- Monitoring
Software should be designed from the start to be monitored
- ORM
ORMs mix different concerns. They were introduced by OO zealots to avoid the declarative nature of SQL. Now according to Martin Fowler they are just a way to get a memory cache. Fine, but that is not the way ORMs have been sold, is it?
The whole ORM story looks like a complete disaster. Building a graph of objects in memory across sessions has proven to make little sense in many projects I have worked on.
If you deal with a relational database, abstracting it with mutable pojos is dubious at best. I am pretty convinced a nice query API such as LINQ can solve the problem of the myriads of SQL statements to handle.
Here is the problem: in the end, data is fed into viewers. So let's get this straight: output JSON directly from a query language interface!
Scrible
Please do understand the difference between → and ⇒ … what you need here is "leads to", maybe "implies"
RAM : Heap & Stack
IDENTITY LABEL FOR A TIMELINE CONFLATE ID WITH STATE REF TYPE BOXES TO VALUES
How do we express polymorphism in UML. You mark class with a stereotype. You have to see class as something really global in UML. It is just a blueprint of code.
Elastic Search
Characteristics
ES is built upon the Lucene search engine. Everything is stored in an inverted index. It features:
-
HA
-
automatic index creation by generating a "mapping"
Terminology
- index
-
All documents live in an index. An index is roughly the same as a database, which means it is just a namespace.
- type
-
Before version 6, an index could have one or more types. A type was like a table, a collection of similar things. This notion of type is now deprecated. In version 6, you still have to indicate one type (usually called
_doc
by convention). From version 7 on, the type is optional and it will ultimately disappear from ES jargon. - document
-
A document is like a row. It is composed of field/value (a field is alike a column in RDB).
- version
-
ES only keeps one version of a document. The version number is kept by ES for engineering purpose but should not be used in the applicative/business layer.
- mappings
-
Map fields with data types.
- analysis
-
process of converting full text into
terms
for the inverted index - node
-
An instance of ES. Usually one per machine.
- cluster
-
A set of nodes. You might separate nodes into clusters because:
-
the usage/ownership/… of the data are different
-
the nodes are located in two different datacenter
-
- shards
-
By default each index is divided in 5 pieces called
shards
. This number is defined at index creation. A document will live on a single shard. ES tries to evenly distribute documents within an index among all the shards. A shard is a single instance of Lucene and should roughly stay around a size of about 10G.
- replica
-
Shards are usually replicated twice (number_of_replicas = 1)
- segments
-
A shard is written on disks in multiple segment files.
Data types
-
Simple
-
text
: full text analyzed strings -
keyword
: sorting/aggregation of exact values (not analyzed). -
byte/short/integer/float/double
: numeric value -
date
-
boolean
-
ip
-
-
Hierarchical: object, nested
-
Range
-
Specialized: geo_point, percolator
APIs
GET blogs/_search
{
"query": {
"match": {
"content": "ingest nodes"
}
}
}
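The same search can be issued from any HTTP client; a sketch with Python requests against a local node (URL and port are assumptions):
import requests

# POST also works for _search and avoids GET-with-body quirks
resp = requests.post(
    "http://localhost:9200/blogs/_search",
    json={"query": {"match": {"content": "ingest nodes"}}},
)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_id"])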
PUT blogs
{
"mappings": {
"_doc": {
"properties": {
"content": {
"type": "text"
},
...
}
}
}
}
GET _analyze
GET _cluster/state
PUT blogs/_settings
{
"settings": {
"number_of_replicas": 0 (1)
}
}
1 | you can dynamically change the number of replicas but not the number_of_shards |
POST _reindex
{
"source": {
"index": "blogs",
"query": {
...
}
},
"dest": {
"index": "blogs_fixed"
}
}
PUT _ingest/pipeline/fix_locales
{
"processors": [
{
"script": {
"source": """
if("".equals(ctx.locales)) {
ctx.locales = "en-en";
}
ctx.reindexBatch = 3;
"""
}
},
{
"split": {
"field": "locales",
"separator": ","
}
}
]
}
Node roles
-
master eligible
Only one master node per cluster. It is the sole node capable of changing the cluster state. You need an odd number of master-eligible nodes (quorum) to avoid split brain.
-
data
Hold the shards and execute CRUD operations.
-
ingest
Run ingest pipelines that pre-process documents before indexing.
-
coordinator
Receives client requests. Every node is implicitly a coordinating node. Acts as a smart load balancer.
Cluster management
shard filtering
shard allocation awareness
Logstash
The main role of Logstash is to transform (in a centralized place) a stream of data before it is indexed in ES. For some data inputs, such as SQL databases, it is the only officially "supported" way to get the data into ES.
Beats
APM
Category theory
Abstract algebra of functions. In category theory we never look inside objects. All information about objects is encoded in the arrows (morphisms) between them.
- definition
-
A category is a bunch of objects together with
morphisms
[9]. The objects have no structure or meaning; actually they only serve to mark the beginning or end of an arrow. Morphisms are directed mappings between these objects [10] that preserve a structure
. The structure, whatever it is, characterizes the category.
-
there must exist a morphism called
identity
(the zero) that maps an object into itself (e.g: 1A). -
the morphisms need to compose while respecting
associativity
:
h∘(g∘f) == (h∘g)∘f
-
In a functional programming language, morphisms/arrows are functions
and objects are types
.
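Both laws are directly checkable when morphisms are plain functions; a small Python sketch:
def compose(g, f):
    """g . f: apply f first, then g."""
    return lambda x: g(f(x))

identity = lambda x: x

f = lambda x: x + 1   # Int -> Int
g = lambda x: x * 2   # Int -> Int
h = str               # Int -> String

# associativity: h∘(g∘f) == (h∘g)∘f
assert compose(h, compose(g, f))(3) == compose(compose(h, g), f)(3)
# identity is a unit for composition: f∘id == f == id∘f
assert compose(f, identity)(3) == f(3) == compose(identity, f)(3)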
Purescript
Quick setup
→ nix-env -iA nixos.purescript
→ nix-env -iA nixos.psc-package
→ nix-env -f ~/.config/nixpkgs/pin.nix -iA nodePackages.pulp
→ pulp --psc-package init
→ pulp build --to dist/test.js
→ cat > dist/test.html <<EOF
<!doctype html>
<html>
<head>
<title>Test Purescript</title>
<style>
body {
font-family: sans-serif;
max-width: 570px;
margin: auto;
}
</style>
</head>
<body>
<script src="test.js"></script>
</body>
</html>
EOF
Tips & tricks