HurdleDMR from Python

HurdleDMR.jl is a Julia implementation of the Hurdle Distributed Multinomial Regression (HDMR), as described in:

Kelly, Bryan, Asaf Manela, and Alan Moreira (2018). Text Selection. Working paper.

It includes a Julia implementation of the Distributed Multinomial Regression (DMR) model of Taddy (2015).

This tutorial explains how to use this package from Python via the PyJulia package.

Setup

Install Julia

First, install Julia itself. The easiest way is to download it from https://julialang.org/downloads/. An alternative is to install JuliaPro from https://juliacomputing.com.

Once installed, open julia in a terminal (or in Juno), press ] to enter the package manager, and add the following packages:

pkg> add HurdleDMR GLM Lasso

Install PyJulia

See the PyJulia documentation for installation instructions.

Because I use miniconda, I also had to run the following, but you might not:

In [1]:
from julia.api import Julia
jl = Julia(compiled_modules=False)

Add parallel workers and make HurdleDMR package available to workers

In [2]:
jl.eval("using Distributed")
from julia.Distributed import addprocs
addprocs(4)

from julia import HurdleDMR as hd
jl.eval("@everywhere using HurdleDMR")

Example Data

Set up your data as an n-by-p covars matrix and a (sparse) n-by-d counts matrix. Here we generate some random data.

In [3]:
import numpy as np
from scipy import sparse

n = 100
p = 3
d = 4

np.random.seed(123)
m = 1 + np.random.poisson(5,n)
covars = np.random.uniform(0,1,(n,p))

q = [[0 + j*sum(covars[i,:]) for j in range(d)] for i in range(n)]
#rowsums = [sum(q[i]) for i in range(n)]
q = [q[i]/sum(q[i]) for i in range(n)]

#counts = sparse.csr_matrix(np.concatenate([[np.random.multinomial(m[i],q[i]) for i in range(n)]]))
counts = np.concatenate([[np.random.multinomial(m[i],q[i]) for i in range(n)]])
counts
Out[3]:
array([[0, 2, 3, 3],
       [0, 2, 2, 2],
       [0, 1, 2, 2],
       [0, 1, 1, 7],
       [0, 1, 1, 3],
       [0, 1, 1, 7],
       [0, 1, 2, 5],
       [0, 1, 0, 5],
       [0, 1, 5, 4],
       [0, 0, 1, 4],
       [0, 1, 2, 1],
       [0, 0, 3, 2],
       [0, 2, 3, 3],
       [0, 0, 2, 7],
       [0, 1, 5, 0],
       [0, 0, 2, 5],
       [0, 1, 0, 4],
       [0, 0, 1, 3],
       [0, 1, 3, 4],
       [0, 1, 3, 5],
       [0, 1, 1, 3],
       [0, 1, 0, 6],
       [0, 0, 2, 5],
       [0, 1, 3, 4],
       [0, 1, 3, 3],
       [0, 0, 2, 4],
       [0, 0, 1, 2],
       [0, 1, 0, 3],
       [0, 0, 2, 2],
       [0, 0, 2, 6],
       [0, 1, 1, 3],
       [0, 1, 0, 6],
       [0, 1, 2, 3],
       [0, 1, 2, 4],
       [0, 2, 0, 5],
       [0, 1, 1, 2],
       [0, 0, 2, 3],
       [0, 1, 1, 3],
       [0, 2, 2, 3],
       [0, 1, 6, 0],
       [0, 0, 0, 5],
       [0, 0, 2, 3],
       [0, 0, 3, 0],
       [0, 0, 2, 3],
       [0, 0, 2, 3],
       [0, 2, 2, 0],
       [0, 0, 2, 1],
       [0, 3, 3, 6],
       [0, 0, 1, 3],
       [0, 0, 1, 1],
       [0, 2, 1, 1],
       [0, 0, 2, 5],
       [0, 1, 3, 3],
       [0, 0, 0, 4],
       [0, 2, 0, 2],
       [0, 2, 2, 6],
       [0, 2, 1, 4],
       [0, 2, 1, 5],
       [0, 0, 1, 2],
       [0, 1, 2, 2],
       [0, 1, 1, 4],
       [0, 0, 1, 7],
       [0, 0, 4, 6],
       [0, 0, 2, 2],
       [0, 1, 1, 3],
       [0, 0, 0, 5],
       [0, 2, 2, 2],
       [0, 0, 1, 3],
       [0, 2, 2, 1],
       [0, 0, 2, 3],
       [0, 1, 1, 8],
       [0, 1, 1, 1],
       [0, 0, 2, 3],
       [0, 0, 2, 5],
       [0, 1, 2, 3],
       [0, 2, 2, 2],
       [0, 2, 2, 1],
       [0, 4, 1, 2],
       [0, 3, 0, 4],
       [0, 0, 3, 1],
       [0, 0, 0, 5],
       [0, 0, 4, 1],
       [0, 2, 1, 2],
       [0, 2, 4, 3],
       [0, 2, 3, 5],
       [0, 0, 0, 8],
       [0, 1, 0, 4],
       [0, 1, 0, 3],
       [0, 1, 3, 2],
       [0, 1, 1, 6],
       [0, 0, 1, 5],
       [0, 0, 1, 4],
       [0, 0, 3, 2],
       [0, 1, 3, 2],
       [0, 3, 0, 3],
       [0, 2, 7, 5],
       [0, 0, 4, 3],
       [0, 3, 1, 1],
       [0, 0, 3, 3],
       [0, 2, 3, 0]])
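
A side note on this construction, offered as a reading of the code above rather than something the tutorial states: the weight of category j is j times the row sum of covars, so the row sum cancels when q is normalized. Every observation would then get the same probabilities (0, 1/6, 1/3, 1/2), which would explain why the first column of counts is always zero. A minimal check in plain Python, using two made-up covariate rows (the function name `category_probs` is hypothetical):

```python
# Category weights as in the data-generating code: weight_j = j * sum(covariates).
def category_probs(covar_row, d=4):
    s = sum(covar_row)
    weights = [j * s for j in range(d)]
    total = sum(weights)
    return [w / total for w in weights]

# Two arbitrary (hypothetical) covariate rows give the same probabilities.
probs_a = category_probs([0.2, 0.5, 0.9])
probs_b = category_probs([0.7, 0.1, 0.3])
print(probs_a)
print(probs_b)
```

If that reading is right, the covariates carry no signal in this toy data set, so small or zero slope coefficients in the fits are to be expected.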

Distributed Multinomial Regression (DMR)

The Distributed Multinomial Regression (DMR) model of Taddy (2015) is a highly scalable approximation to the Multinomial using distributed (independent, parallel) Poisson regressions, one for each of the d categories (columns) of a large counts matrix, on the covars.
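
In symbols, a sketch of the approximation following Taddy (2015). The notation is introduced here for illustration and is not taken from the package: $c_{ij}$ is the count for observation $i$ in category $j$, $m_i = \sum_j c_{ij}$ is the row total, $v_i$ is the row of covars, and $\alpha_j$, $\varphi_j$ are the intercept and slopes for category $j$. The multinomial logistic model

$$
c_i \sim \mathrm{Multinomial}(q_i, m_i), \qquad
q_{ij} = \frac{\exp(\alpha_j + v_i'\varphi_j)}{\sum_{l=1}^{d} \exp(\alpha_l + v_i'\varphi_l)}
$$

is replaced by d independent Poisson regressions that use $\log m_i$ as a plug-in offset:

$$
c_{ij} \sim \mathrm{Poisson}\left(m_i \exp(\alpha_j + v_i'\varphi_j)\right), \qquad j = 1, \dots, d.
$$

Because each category has its own likelihood, the d regressions can be estimated in parallel on separate workers.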

To fit a DMR:

In [4]:
m = hd.dmr(covars, counts)

We can get the coefficient matrix (an intercept row followed by one row per covariate, with one column per category) as usual with

In [5]:
hd.coef(m)
Out[5]:
array([[ 0.        , -1.94507425, -1.28828706, -0.59041306],
       [ 0.        ,  0.1056461 ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.1268672 ,  0.        ]])

By default we only return the AICc-minimizing coefficients. To also get back the entire regularization paths, run

In [6]:
paths = hd.dmrpaths(covars, counts)
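
To make the AICc selection rule concrete, here is a minimal sketch in plain Python. It uses the textbook corrected AIC formula and a made-up path of (log-likelihood, number of nonzero coefficients) pairs; the function name `aicc` and the numbers are hypothetical, and the package may count degrees of freedom differently.

```python
def aicc(loglik, k, n):
    """Corrected AIC: -2*loglik + 2k + 2k(k+1)/(n-k-1)."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

# Hypothetical path: (log-likelihood, nonzero coefficients) for each segment.
path = [(-120.0, 1), (-112.0, 2), (-111.5, 3), (-111.4, 4)]
n = 100  # number of observations

scores = [aicc(ll, k, n) for ll, k in path]
# The selected segment is the one with the smallest AICc.
best = min(range(len(path)), key=lambda s: scores[s])
print(best, scores[best])
```

The penalty grows with the number of nonzero coefficients, so a segment with a slightly higher likelihood can still lose to a sparser one.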

We can now select, for example, the coefficients that minimize the 10-fold cross-validation MSE (this takes a while):

In [7]:
jl.eval("using Lasso: MinCVmse")
from julia import Lasso
gen = jl.eval("MinCVKfold{MinCVmse}(10)")
hd.coef(paths, gen)
Out[7]:
array([[ 0.00000000e+00, -1.89167038e+00, -1.22050226e+00,
        -5.90413062e-01],
       [ 0.00000000e+00,  3.18787704e-11,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  2.97862348e-07,
         0.00000000e+00]])
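
For readers unfamiliar with K-fold cross-validation, the following is a generic sketch of how observations can be split into 10 folds. It is an illustration in plain Python, not the package's internal implementation; `kfold_indices` is a hypothetical helper.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

folds = kfold_indices(n=100, k=10)
for f, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(100) if i not in held_out]
    # A model would be fit on train_idx and its squared error measured on
    # test_idx; the fitting step is omitted in this sketch.
```

Each path segment's error is averaged over the folds, and the segment with the lowest average is selected. Every fold requires refitting the paths, which is why this selector is slower than AICc.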

Hurdle Distributed Multinomial Regression (HDMR)

For highly sparse counts, as is often the case with text that is selected for various reasons, the Hurdle Distributed Multinomial Regression (HDMR) model of Kelly, Manela, and Moreira (2018) may be superior to the DMR. It approximates a higher-dispersion Multinomial using distributed (independent, parallel) Hurdle regressions, one for each of the d categories (columns) of a large counts matrix, on the covars. It allows potentially different sets of covariates to explain category inclusion ($h=\mathbf{1}\{c>0\}$) and repetition ($c>0$).

Both the model for zeros and the model for positive counts are regularized by default using GammaLassoPath, picking the AICc-optimal segment of the regularization path.
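
The two parts of the hurdle model look at different slices of the data. As a small illustration in plain Python, using three rows copied from the counts matrix shown earlier, the inclusion indicator and the positive counts can be separated like this (variable names are chosen here for illustration):

```python
# A few rows copied from the counts matrix shown earlier.
counts_rows = [[0, 2, 3, 3], [0, 1, 0, 5], [0, 0, 3, 0]]

# Inclusion indicator h = 1{c > 0} for every cell.
h = [[1 if c > 0 else 0 for c in row] for row in counts_rows]

# Positive counts per category (column): only cells with c > 0 are kept.
d = len(counts_rows[0])
positives = [[row[j] for row in counts_rows if row[j] > 0] for j in range(d)]

print(h)
print(positives)
```

The indicator h is what the model for zeros explains, while the repetition model only sees the cells where a category actually appears.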

To fit an HDMR:

In [8]:
m = hd.hdmr(covars, counts, inpos=[1,2], inzero=[1,2,3])

We can get the coefficient matrices (intercept plus covariates) for the positive-counts model and the model for zeros as usual with

In [9]:
coefspos, coefszero = hd.coef(m)
print("coefspos:\n", coefspos)
print("coefszero:\n", coefszero)
coefspos:
 [[ 0.         -2.18288411 -1.18060442 -0.41828599]
 [ 0.          0.33062404  0.          0.        ]
 [ 0.          0.          0.02997958  0.08338104]]
coefszero:
 [[ 0.          0.04616614  1.36309252  3.07912436]
 [ 0.          0.          0.         -0.67010761]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.          0.          0.        ]]
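
The shapes printed above (3 rows in coefspos, 4 rows in coefszero) are consistent with inpos and inzero selecting which columns of covars enter each model, plus an intercept. The sketch below assumes these are 1-based column indices, following Julia's convention; that reading is an assumption, and `select_columns` is a hypothetical helper written in plain Python.

```python
# Hypothetical 2-row covariate matrix with p = 3 columns.
covars_rows = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

inpos = [1, 2]      # assumed 1-based column indices, as passed to hd.hdmr
inzero = [1, 2, 3]

def select_columns(rows, one_based_cols):
    """Pick columns by 1-based index (Julia convention) from a list of rows."""
    return [[row[c - 1] for c in one_based_cols] for row in rows]

covars_pos = select_columns(covars_rows, inpos)    # covariates for positive counts
covars_zero = select_columns(covars_rows, inzero)  # covariates for the zeros model

# With an intercept added, the coefficient matrices would have
# len(inpos) + 1 and len(inzero) + 1 rows respectively.
print(len(inpos) + 1, len(inzero) + 1)
```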

By default we only return the AICc-minimizing coefficients. To also get back the entire regularization paths, run

In [10]:
paths = hd.hdmrpaths(covars, counts)

hd.coef(paths, Lasso.AllSeg())
Out[10]:
(array([[[ 0.00000000e+00, -2.02133392e+00, -1.16575159e+00,
          -3.76235470e-01],
         [ 0.00000000e+00,  2.90768861e-08,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.88213201e-12,
           1.36664985e-10],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.05931209e+00, -1.17116227e+00,
          -3.80396163e-01],
         [ 0.00000000e+00,  7.93643648e-02,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.09392297e-02,
           8.30076060e-03],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.09423553e+00, -1.17609960e+00,
          -3.84191708e-01],
         [ 0.00000000e+00,  1.51431219e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  2.09033261e-02,
           1.58632356e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.12633252e+00, -1.18060442e+00,
          -3.87653804e-01],
         [ 0.00000000e+00,  2.16918343e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  2.99795817e-02,
           2.27532006e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.15581588e+00, -1.18671462e+00,
          -3.90811453e-01],
         [ 0.00000000e+00,  2.76460942e-01,  4.61259056e-03,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  3.76960464e-02,
           2.90305459e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18288411e+00, -1.19351483e+00,
          -3.93691184e-01],
         [ 0.00000000e+00,  3.30624044e-01,  1.16365381e-02,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  4.43915698e-02,
           3.47498089e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18509703e+00, -1.19971939e+00,
          -3.96317254e-01],
         [ 0.00000000e+00,  3.87485773e-01,  1.80355614e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -5.11025873e-02,  5.04946902e-02,
           3.99606571e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18608377e+00, -1.20537987e+00,
          -3.98711836e-01],
         [ 0.00000000e+00,  4.39887722e-01,  2.38654722e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -1.00972759e-01,  5.60576447e-02,
           4.47083262e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18722497e+00, -1.21054341e+00,
          -4.00895193e-01],
         [ 0.00000000e+00,  4.87737174e-01,  2.91769653e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -1.46676441e-01,  6.11281004e-02,
           4.90340183e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18846599e+00, -1.21525323e+00,
          -4.02885839e-01],
         [ 0.00000000e+00,  5.31416284e-01,  3.40163139e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -1.88537708e-01,  6.57495256e-02,
           5.29752634e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18976444e+00, -1.21954865e+00,
          -4.04700683e-01],
         [ 0.00000000e+00,  5.71279194e-01,  3.84251907e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -2.26859921e-01,  6.99616237e-02,
           5.65662484e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.19108719e+00, -1.22346595e+00,
          -4.06355168e-01],
         [ 0.00000000e+00,  6.07651872e-01,  4.24422459e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -2.61926040e-01,  7.38005241e-02,
           5.98381161e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.19240871e+00, -1.22703811e+00,
          -4.07863396e-01],
         [ 0.00000000e+00,  6.40833982e-01,  4.61022281e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -2.93999344e-01,  7.72992307e-02,
           6.28192369e-02],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.19238025e+00, -1.23029533e+00,
          -4.09238237e-01],
         [ 0.00000000e+00,  6.70631777e-01,  4.94369808e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -3.23271113e-01,  8.04878185e-02,
           6.55354565e-02],
         [ 0.00000000e+00, -2.16424573e-03,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.18350673e+00, -1.23326513e+00,
          -4.10491440e-01],
         [ 0.00000000e+00,  6.94673188e-01,  5.24752660e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -3.49641464e-01,  8.33937415e-02,
           6.80103213e-02],
         [ 0.00000000e+00, -1.87391863e-02,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.17555276e+00, -1.23597274e+00,
          -4.11633728e-01],
         [ 0.00000000e+00,  7.16571980e-01,  5.52435290e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -3.73689671e-01,  8.60420042e-02,
           7.02652830e-02],
         [ 0.00000000e+00, -3.37752323e-02,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.16841470e+00, -1.23844115e+00,
          -4.12674883e-01],
         [ 0.00000000e+00,  7.36519747e-01,  5.77657182e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -3.95617699e-01,  8.84554223e-02,
           7.23198857e-02],
         [ 0.00000000e+00, -4.74203130e-02,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.16200061e+00, -1.23950189e+00,
          -4.13623832e-01],
         [ 0.00000000e+00,  7.54688730e-01,  5.97651354e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.15610148e-01,  9.07380663e-02,
           7.41919352e-02],
         [ 0.00000000e+00, -5.98077251e-02, -2.01295591e-03,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.15623264e+00, -1.24027038e+00,
          -4.14488717e-01],
         [ 0.00000000e+00,  7.71240849e-01,  6.15374039e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.33836708e-01,  9.28323970e-02,
           7.58976545e-02],
         [ 0.00000000e+00, -7.10563849e-02, -4.18515353e-03,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.15103859e+00, -1.24097200e+00,
          -4.15276967e-01],
         [ 0.00000000e+00,  7.86316683e-01,  6.31524154e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.50451628e-01,  9.47411754e-02,
           7.74518242e-02],
         [ 0.00000000e+00, -8.12746149e-02, -6.16430919e-03,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.14635745e+00, -1.24161255e+00,
          -4.15995356e-01],
         [ 0.00000000e+00,  8.00048953e-01,  6.46243290e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.65596577e-01,  9.64807535e-02,
           7.88679109e-02],
         [ 0.00000000e+00, -9.05591315e-02, -7.96755219e-03,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.14213433e+00, -1.24219749e+00,
          -4.16500855e-01],
         [ 0.00000000e+00,  8.12556610e-01,  6.59662780e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.79400722e-01,  9.80660122e-02,
           8.01228005e-02],
         [ 0.00000000e+00, -9.89975077e-02, -9.61045464e-03,
          -2.53526513e-04]],
 
        [[ 0.00000000e+00, -2.13832223e+00, -1.24273090e+00,
          -4.16524455e-01],
         [ 0.00000000e+00,  8.23950636e-01,  6.71883457e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -4.91982430e-01,  9.95108644e-02,
           8.11625791e-02],
         [ 0.00000000e+00, -1.06668271e-01, -1.11074815e-02,
          -1.22738202e-03]],
 
        [[ 0.00000000e+00, -2.13487848e+00, -1.24321785e+00,
          -4.16546101e-01],
         [ 0.00000000e+00,  8.34330240e-01,  6.83025049e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -5.03449495e-01,  1.00827492e-01,
           8.21099308e-02],
         [ 0.00000000e+00, -1.13642695e-01, -1.24714014e-02,
          -2.11472118e-03]],
 
        [[ 0.00000000e+00, -2.13176475e+00, -1.24366215e+00,
          -4.16566026e-01],
         [ 0.00000000e+00,  8.43784853e-01,  6.93178643e-02,
           0.00000000e+00],
         [ 0.00000000e+00, -5.13900160e-01,  1.02027331e-01,
           8.29732207e-02],
         [ 0.00000000e+00, -1.19985376e-01, -1.37141128e-02,
          -2.92321139e-03]],
 
        [[ 0.00000000e+00, -2.12894736e+00,  0.00000000e+00,
          -4.16584271e-01],
         [ 0.00000000e+00,  8.52396617e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.23424173e-01,  0.00000000e+00,
           8.37597669e-02],
         [ 0.00000000e+00, -1.25754604e-01,  0.00000000e+00,
          -3.65987824e-03]],
 
        [[ 0.00000000e+00, -2.12639732e+00,  0.00000000e+00,
          -4.16601040e-01],
         [ 0.00000000e+00,  8.60242101e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.32103692e-01,  0.00000000e+00,
           8.44765171e-02],
         [ 0.00000000e+00, -1.31002791e-01,  0.00000000e+00,
          -4.33108646e-03]],
 
        [[ 0.00000000e+00, -2.12408719e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.67388179e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.40013206e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.35777947e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.12199430e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.73898999e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.47221165e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.40122941e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.12009624e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.79829144e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.53789376e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.44077371e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11837504e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.85232119e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.59774802e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.47676414e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11681330e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.90154267e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.65228992e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.50952442e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11539572e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.94638435e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.70199047e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.53934713e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11410858e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  8.98723642e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.74727898e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.56649788e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11293952e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.02445426e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.78854679e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.59121790e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11187742e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.05836154e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.82615060e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.61372631e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11091236e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.08925514e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.86041586e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.63422184e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.11003505e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.11739915e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.89163815e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.65288631e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.10923735e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.14303869e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.92008759e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.66988416e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.10851208e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.16640013e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.94601088e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.68536426e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.10785258e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.18768670e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.96963235e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.69946260e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.10725249e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  9.20707718e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -5.99115538e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00, -1.71230413e-01,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]]]),
 array([[[ 0.00000000e+00,  4.61661434e-02,  1.36309252e+00,
           2.70806114e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -2.06142611e-05],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           0.00000000e+00]],
 
        [[ 0.00000000e+00,  2.85616460e-02,  1.31383731e+00,
           2.79335381e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.59061397e-01],
         [ 0.00000000e+00,  3.38630981e-02,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  9.36637611e-02,
           0.00000000e+00]],
 
        [[ 0.00000000e+00,  1.25399866e-02,  1.26935403e+00,
           2.87277112e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -3.04323034e-01],
         [ 0.00000000e+00,  6.47156833e-02,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.79023813e-01,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.04337784e-03,  1.22912935e+00,
           2.94665932e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -4.37162628e-01],
         [ 0.00000000e+00,  9.28269244e-02,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  2.56859150e-01,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -1.53193133e-02,  1.19271630e+00,
           3.01534195e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -5.58750354e-01],
         [ 0.00000000e+00,  1.18441468e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  3.27862418e-01,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -2.74064001e-02,  1.15972356e+00,
           3.07912436e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -6.70107611e-01],
         [ 0.00000000e+00,  1.41781788e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  3.92652538e-01,
           0.00000000e+00]],
 
        [[ 0.00000000e+00, -3.84121692e-02,  1.12980669e+00,
           3.16724819e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -7.80647397e-01],
         [ 0.00000000e+00,  1.63050287e-01,  0.00000000e+00,
           0.00000000e+00],
         [ 0.00000000e+00,  0.00000000e+00,  4.51785276e-01,
          -4.66554241e-02]],
 
        [[ 0.00000000e+00, -4.20435289e-02,  1.10266103e+00,
           3.32311386e+00],
         [ 0.00000000e+00, -1.24407718e-02,  0.00000000e+00,
          -8.94467551e-01],
         [ 0.00000000e+00,  1.82447093e-01,  0.00000000e+00,
          -7.28615975e-02],
         [ 0.00000000e+00,  0.00000000e+00,  5.05761709e-01,
          -1.43806892e-01]],
 
        [[ 0.00000000e+00, -4.22947674e-02,  1.08817310e+00,
           3.48370582e+00],
         [ 0.00000000e+00, -2.97187640e-02,  0.00000000e+00,
          -1.00091662e+00],
         [ 0.00000000e+00,  2.00137850e-01, -1.89963162e-02,
          -1.66064974e-01],
         [ 0.00000000e+00,  0.00000000e+00,  5.54656529e-01,
          -2.34399985e-01]],
 
        [[ 0.00000000e+00, -4.25159707e-02,  1.09214696e+00,
           3.63334616e+00],
         [ 0.00000000e+00, -4.54694266e-02,  0.00000000e+00,
          -1.09944000e+00],
         [ 0.00000000e+00,  2.16267024e-01, -6.81794362e-02,
          -2.52240140e-01],
         [ 0.00000000e+00,  0.00000000e+00,  5.98688726e-01,
          -3.17458672e-01]],
 
        [[ 0.00000000e+00, -4.27110815e-02,  1.09598990e+00,
           3.77260703e+00],
         [ 0.00000000e+00, -5.98277741e-02,  0.00000000e+00,
          -1.19056324e+00],
         [ 0.00000000e+00,  2.30972233e-01, -1.13162886e-01,
          -3.31887477e-01],
         [ 0.00000000e+00,  0.00000000e+00,  6.38926994e-01,
          -3.93632966e-01]],
 
        [[ 0.00000000e+00, -4.28834834e-02,  1.09967816e+00,
           3.90204279e+00],
         [ 0.00000000e+00, -7.29167919e-02,  0.00000000e+00,
          -1.27477720e+00],
         [ 0.00000000e+00,  2.44378916e-01, -1.54302018e-01,
          -4.05464127e-01],
         [ 0.00000000e+00,  0.00000000e+00,  6.75698055e-01,
          -4.63502203e-01]],
 
        [[ 0.00000000e+00, -4.30360850e-02,  1.10319550e+00,
           4.02219279e+00],
         [ 0.00000000e+00, -8.48485452e-02,  0.00000000e+00,
          -1.35254253e+00],
         [ 0.00000000e+00,  2.56601440e-01, -1.91920779e-01,
          -4.73391855e-01],
         [ 0.00000000e+00,  0.00000000e+00,  7.09299114e-01,
          -5.27587358e-01]],
 
        [[ 0.00000000e+00, -3.82712274e-02,  1.10653190e+00,
           4.13358329e+00],
         [ 0.00000000e+00, -9.68005407e-02,  0.00000000e+00,
          -1.42429398e+00],
         [ 0.00000000e+00,  2.67852884e-01, -2.26315411e-01,
          -5.36062913e-01],
         [ 0.00000000e+00, -8.20534596e-03,  7.40001074e-01,
          -5.86360459e-01]],
 
        [[ 0.00000000e+00, -3.15994970e-02,  1.10968221e+00,
           4.23672728e+00],
         [ 0.00000000e+00, -1.08205050e-01,  0.00000000e+00,
          -1.49044244e+00],
         [ 0.00000000e+00,  2.78161708e-01, -2.57757175e-01,
          -5.93843955e-01],
         [ 0.00000000e+00, -1.95769068e-02,  7.68051260e-01,
          -6.40251593e-01]],
 
        [[ 0.00000000e+00, -2.55146946e-02,  1.11264511e+00,
           4.33212374e+00],
         [ 0.00000000e+00, -1.18599123e-01,  0.00000000e+00,
          -1.55137660e+00],
         [ 0.00000000e+00,  2.87559334e-01, -2.86494678e-01,
          -6.47078855e-01],
         [ 0.00000000e+00, -2.99430237e-02,  7.93675740e-01,
          -6.89654330e-01]],
 
        [[ 0.00000000e+00, -1.99650707e-02,  1.11542225e+00,
           4.42021706e+00],
         [ 0.00000000e+00, -1.28073136e-01,  0.00000000e+00,
          -1.60741931e+00],
         [ 0.00000000e+00,  2.96126217e-01, -3.12755904e-01,
          -6.96078474e-01],
         [ 0.00000000e+00, -3.93925449e-02,  8.17081326e-01,
          -7.34922008e-01]],
 
        [[ 0.00000000e+00, -1.49038722e-02,  1.11801753e+00,
           4.50155802e+00],
         [ 0.00000000e+00, -1.36708468e-01,  0.00000000e+00,
          -1.65901256e+00],
         [ 0.00000000e+00,  3.03935620e-01, -3.36749999e-01,
          -7.41172412e-01],
         [ 0.00000000e+00, -4.80063459e-02,  8.38457334e-01,
          -7.76403840e-01]],
 
        [[ 0.00000000e+00, -1.02887156e-02,  1.12043652e+00,
           4.57655239e+00],
         [ 0.00000000e+00, -1.44578676e-01,  0.00000000e+00,
          -1.70643374e+00],
         [ 0.00000000e+00,  3.11054287e-01, -3.58668855e-01,
          -7.82632585e-01],
         [ 0.00000000e+00, -5.58580898e-02,  8.57977122e-01,
          -8.14397247e-01]],
 
        [[ 0.00000000e+00, -6.08037107e-03,  1.12268601e+00,
           4.64563080e+00],
         [ 0.00000000e+00, -1.51751775e-01,  0.00000000e+00,
          -1.74999079e+00],
         [ 0.00000000e+00,  3.17543186e-01, -3.78688542e-01,
          -8.20727078e-01],
         [ 0.00000000e+00, -6.30150741e-02,  8.75799475e-01,
          -8.49184614e-01]],
 
        [[ 0.00000000e+00, -2.24332009e-03,  1.12477360e+00,
           4.70920463e+00],
         [ 0.00000000e+00, -1.58289182e-01,  0.00000000e+00,
          -1.78997342e+00],
         [ 0.00000000e+00,  3.23457859e-01, -3.96970594e-01,
          -8.55707535e-01],
         [ 0.00000000e+00, -6.95385938e-02,  8.92069840e-01,
          -8.81026555e-01]],
 
        [[ 0.00000000e+00,  1.25539588e-03,  1.12670744e+00,
           4.76766418e+00],
         [ 0.00000000e+00, -1.64247867e-01,  0.00000000e+00,
          -1.82665222e+00],
         [ 0.00000000e+00,  3.28849064e-01, -4.13663185e-01,
          -8.87809443e-01],
         [ 0.00000000e+00, -7.54846745e-02,  9.06921443e-01,
          -9.10163355e-01]],
 
        [[ 0.00000000e+00,  4.44509906e-03,  1.12849596e+00,
           4.82138310e+00],
         [ 0.00000000e+00, -1.69678320e-01,  0.00000000e+00,
          -1.86028452e+00],
         [ 0.00000000e+00,  3.33762945e-01, -4.28902200e-01,
          -9.17254349e-01],
         [ 0.00000000e+00, -8.09042219e-02,  9.20476306e-01,
          -9.36817475e-01]],
 
        [[ 0.00000000e+00,  7.35305627e-03,  1.13014770e+00,
           4.87071040e+00],
         [ 0.00000000e+00, -1.74627500e-01,  0.00000000e+00,
          -1.89110658e+00],
         [ 0.00000000e+00,  3.38241680e-01, -4.42812220e-01,
          -9.44247692e-01],
         [ 0.00000000e+00, -8.58437803e-02,  9.32846172e-01,
          -9.61193345e-01]],
 
        [[ 0.00000000e+00,  1.00040425e-02,  1.12883320e+00,
           4.91597576e+00],
         [ 0.00000000e+00, -1.79138000e-01,  4.66945358e-03,
          -1.91933946e+00],
         [ 0.00000000e+00,  3.42323714e-01, -4.55409688e-01,
          -9.68981782e-01],
         [ 0.00000000e+00, -9.03457673e-02,  9.44779814e-01,
          -9.83479628e-01]],
 
        [[ 0.00000000e+00,  1.24206671e-02,  1.12242638e+00,
           4.95748940e+00],
         [ 0.00000000e+00, -1.83248644e-01,  1.75247911e-02,
          -1.94518989e+00],
         [ 0.00000000e+00,  3.46044105e-01, -4.66723824e-01,
          -9.91635464e-01],
         [ 0.00000000e+00, -9.44488630e-02,  9.56861949e-01,
          -1.00385033e+00]],
 
        [[ 0.00000000e+00,  1.46234871e-02,  1.11659996e+00,
           4.99554147e+00],
         [ 0.00000000e+00, -1.86994679e-01,  2.92466160e-02,
          -1.96884932e+00],
         [ 0.00000000e+00,  3.49434810e-01, -4.77045390e-01,
          -1.01237514e+00],
         [ 0.00000000e+00, -9.81883203e-02,  9.67888982e-01,
          -1.02246561e+00]],
 
        [[ 0.00000000e+00,  1.66315708e-02,  1.11130021e+00,
           5.03040178e+00],
         [ 0.00000000e+00, -1.90408819e-01,  3.99346509e-02,
          -1.99049400e+00],
         [ 0.00000000e+00,  3.52525030e-01, -4.86460681e-01,
          -1.03135473e+00],
         [ 0.00000000e+00, -1.01596360e-01,  9.77951866e-01,
          -1.03947263e+00]],
 
        [[ 0.00000000e+00,  1.84618627e-02,  1.10647945e+00,
           5.06232549e+00],
         [ 0.00000000e+00, -1.93520033e-01,  4.96785042e-02,
          -2.01029116e+00],
         [ 0.00000000e+00,  3.55341303e-01, -4.95048649e-01,
          -1.04871804e+00],
         [ 0.00000000e+00, -1.04702251e-01,  9.87133691e-01,
          -1.05500752e+00]],
 
        [[ 0.00000000e+00,  2.01300610e-02,  1.10209333e+00,
           5.09154573e+00],
         [ 0.00000000e+00, -1.96355178e-01,  5.85618462e-02,
          -2.02839078e+00],
         [ 0.00000000e+00,  3.57907873e-01, -5.02881363e-01,
          -1.06459689e+00],
         [ 0.00000000e+00, -1.07532737e-01,  9.95510780e-01,
          -1.06919452e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.09810241e+00,
           5.11828176e+00],
         [ 0.00000000e+00,  0.00000000e+00,  6.66598653e-02,
          -2.04493444e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.10024747e-01,
          -1.07911401e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.00315282e+00,
          -1.08214837e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.09446977e+00,
           5.14273550e+00],
         [ 0.00000000e+00,  0.00000000e+00,  7.40431334e-02,
          -2.06005137e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.16538961e-01,
          -1.09238217e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.01012386e+00,
          -1.09397413e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.09116391e+00,
           5.16509509e+00],
         [ 0.00000000e+00,  0.00000000e+00,  8.07727873e-02,
          -2.07386193e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.22479109e-01,
          -1.10450591e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.01648199e+00,
          -1.10476847e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.08815441e+00,
           5.18553297e+00],
         [ 0.00000000e+00,  0.00000000e+00,  8.69076785e-02,
          -2.08647533e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.27895407e-01,
          -1.11558111e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.02228082e+00,
          -1.11461975e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.08541467e+00,
           5.20420969e+00],
         [ 0.00000000e+00,  0.00000000e+00,  9.24998230e-02,
          -2.09799362e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.32833787e-01,
          -1.12569633e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.02756907e+00,
          -1.12360927e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.08292030e+00,
           5.22127316e+00],
         [ 0.00000000e+00,  0.00000000e+00,  9.75970846e-02,
          -2.10851033e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.37336182e-01,
          -1.13493309e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.03239136e+00,
          -1.13181148e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.08064899e+00,
           5.23685710e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.02243417e-01,
          -2.11810873e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.41440875e-01,
          -1.14336549e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.03678851e+00,
          -1.13929406e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.07858114e+00,
           5.25108969e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.06477782e-01,
          -2.12687077e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.45182848e-01,
          -1.15106314e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.04079763e+00,
          -1.14612002e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.07669795e+00,
           5.26408253e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.10337383e-01,
          -2.13486484e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.48593989e-01,
          -1.15808803e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.04445287e+00,
          -1.15234576e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  1.07498285e+00,
           5.27594377e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.13855278e-01,
          -2.14215972e+00],
         [ 0.00000000e+00,  0.00000000e+00, -5.51703426e-01,
          -1.16449873e+00],
         [ 0.00000000e+00,  0.00000000e+00,  1.04778529e+00,
          -1.15802400e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           5.28676967e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -2.14881517e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.17034803e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.16320235e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           5.29664721e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -2.15488455e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.17568375e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.16792409e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           5.30565687e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -2.16041717e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.18055034e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.17222887e+00]],
 
        [[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
           5.31388116e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -2.16546813e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.18499049e+00],
         [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          -1.17615477e+00]]]))

Sufficient reduction projection

A sufficient reduction projection summarizes the counts, much like a sufficient statistic, and is useful for reducing the d-dimensional counts to a potentially much lower-dimensional matrix z.

To get a sufficient reduction projection in the direction of vy for the above example:

In [11]:
z = hd.srproj(m,counts,1,1)
z
Out[11]:
array([[ 0.08265601, -0.2233692 ,  8.        ,  3.        ],
       [ 0.11020801, -0.2233692 ,  6.        ,  3.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.036736  , -0.2233692 ,  9.        ,  3.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.036736  , -0.2233692 ,  9.        ,  3.        ],
       [ 0.04132801, -0.2233692 ,  8.        ,  3.        ],
       [ 0.05510401, -0.33505381,  6.        ,  2.        ],
       [ 0.0330624 , -0.2233692 , 10.        ,  3.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.08265601, -0.2233692 ,  4.        ,  3.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.08265601, -0.2233692 ,  8.        ,  3.        ],
       [ 0.        , -0.33505381,  9.        ,  2.        ],
       [ 0.05510401,  0.        ,  6.        ,  2.        ],
       [ 0.        , -0.33505381,  7.        ,  2.        ],
       [ 0.06612481, -0.33505381,  5.        ,  2.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.04132801, -0.2233692 ,  8.        ,  3.        ],
       [ 0.036736  , -0.2233692 ,  9.        ,  3.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.04723201, -0.33505381,  7.        ,  2.        ],
       [ 0.        , -0.33505381,  7.        ,  2.        ],
       [ 0.04132801, -0.2233692 ,  8.        ,  3.        ],
       [ 0.04723201, -0.2233692 ,  7.        ,  3.        ],
       [ 0.        , -0.33505381,  6.        ,  2.        ],
       [ 0.        , -0.33505381,  3.        ,  2.        ],
       [ 0.08265601, -0.33505381,  4.        ,  2.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.        , -0.33505381,  8.        ,  2.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.04723201, -0.33505381,  7.        ,  2.        ],
       [ 0.05510401, -0.2233692 ,  6.        ,  3.        ],
       [ 0.04723201, -0.2233692 ,  7.        ,  3.        ],
       [ 0.09446401, -0.33505381,  7.        ,  2.        ],
       [ 0.08265601, -0.2233692 ,  4.        ,  3.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.09446401, -0.2233692 ,  7.        ,  3.        ],
       [ 0.04723201,  0.        ,  7.        ,  2.        ],
       [ 0.        , -0.67010761,  5.        ,  1.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.        ,  0.        ,  3.        ,  1.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.16531202,  0.        ,  4.        ,  2.        ],
       [ 0.        , -0.33505381,  3.        ,  2.        ],
       [ 0.08265601, -0.2233692 , 12.        ,  3.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.        , -0.33505381,  2.        ,  2.        ],
       [ 0.16531202, -0.2233692 ,  4.        ,  3.        ],
       [ 0.        , -0.33505381,  7.        ,  2.        ],
       [ 0.04723201, -0.2233692 ,  7.        ,  3.        ],
       [ 0.        , -0.67010761,  4.        ,  1.        ],
       [ 0.16531202, -0.33505381,  4.        ,  2.        ],
       [ 0.06612481, -0.2233692 , 10.        ,  3.        ],
       [ 0.09446401, -0.2233692 ,  7.        ,  3.        ],
       [ 0.08265601, -0.2233692 ,  8.        ,  3.        ],
       [ 0.        , -0.33505381,  3.        ,  2.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.05510401, -0.2233692 ,  6.        ,  3.        ],
       [ 0.        , -0.33505381,  8.        ,  2.        ],
       [ 0.        , -0.33505381, 10.        ,  2.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.06612481, -0.2233692 ,  5.        ,  3.        ],
       [ 0.        , -0.67010761,  5.        ,  1.        ],
       [ 0.11020801, -0.2233692 ,  6.        ,  3.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.13224962, -0.2233692 ,  5.        ,  3.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.0330624 , -0.2233692 , 10.        ,  3.        ],
       [ 0.11020801, -0.2233692 ,  3.        ,  3.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.        , -0.33505381,  7.        ,  2.        ],
       [ 0.05510401, -0.2233692 ,  6.        ,  3.        ],
       [ 0.11020801, -0.2233692 ,  6.        ,  3.        ],
       [ 0.13224962, -0.2233692 ,  5.        ,  3.        ],
       [ 0.18892802, -0.2233692 ,  7.        ,  3.        ],
       [ 0.14169602, -0.33505381,  7.        ,  2.        ],
       [ 0.        , -0.33505381,  4.        ,  2.        ],
       [ 0.        , -0.67010761,  5.        ,  1.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.13224962, -0.2233692 ,  5.        ,  3.        ],
       [ 0.07347201, -0.2233692 ,  9.        ,  3.        ],
       [ 0.06612481, -0.2233692 , 10.        ,  3.        ],
       [ 0.        , -0.67010761,  8.        ,  1.        ],
       [ 0.06612481, -0.33505381,  5.        ,  2.        ],
       [ 0.08265601, -0.33505381,  4.        ,  2.        ],
       [ 0.05510401, -0.2233692 ,  6.        ,  3.        ],
       [ 0.04132801, -0.2233692 ,  8.        ,  3.        ],
       [ 0.        , -0.33505381,  6.        ,  2.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.        , -0.33505381,  5.        ,  2.        ],
       [ 0.05510401, -0.2233692 ,  6.        ,  3.        ],
       [ 0.16531202, -0.33505381,  6.        ,  2.        ],
       [ 0.04723201, -0.2233692 , 14.        ,  3.        ],
       [ 0.        , -0.33505381,  7.        ,  2.        ],
       [ 0.19837443, -0.2233692 ,  5.        ,  3.        ],
       [ 0.        , -0.33505381,  6.        ,  2.        ],
       [ 0.13224962,  0.        ,  5.        ,  2.        ]])

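Here, the first column is the SR projection from the model for positive counts, the second is the SR projection from the model for hurdle crossing (zeros), and the third is the total count for each observation. The fourth column appears to be the number of categories with a positive count for each observation (for example, the first row of counts, [0, 2, 3, 3], has three nonzero categories).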
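
To make the columns of z concrete, the following NumPy sketch recomputes them by hand. It is an illustration under stated assumptions, not the package's implementation: the function name srproj_sketch is hypothetical, the coefficient vectors are the covariate-1 rows printed by hd.coefbwd(cir) in this tutorial, and the normalizations (total count for the positive part, number of nonzero categories for the hurdle part) are inferred from the printed output rather than taken from the HurdleDMR source.

```python
import numpy as np

def srproj_sketch(phi_pos, phi_zero, counts):
    """Illustrative hurdle SR projection for one covariate (not the package code).

    phi_pos / phi_zero: length-d coefficient vectors of that covariate in the
    positive-counts model and in the hurdle (zero) model.
    """
    counts = np.asarray(counts, dtype=float)
    crossed = (counts > 0).astype(float)   # 1 where a category has a positive count
    m = counts.sum(axis=1)                 # total count per observation
    l = crossed.sum(axis=1)                # number of categories with positive counts
    # Normalize, leaving 0 where an observation has no counts at all.
    zpos = np.divide(counts @ phi_pos, m, out=np.zeros_like(m), where=m > 0)
    zzero = np.divide(crossed @ phi_zero, l, out=np.zeros_like(l), where=l > 0)
    return np.column_stack([zpos, zzero, m, l])

# Covariate-1 coefficients as printed by hd.coefbwd(cir) in this tutorial.
phi_pos = np.array([0.0, 0.33062404, 0.0, 0.0])
phi_zero = np.array([0.0, 0.0, 0.0, -0.67010761])

# First three rows of the example counts matrix.
counts_head = np.array([[0, 2, 3, 3],
                        [0, 2, 2, 2],
                        [0, 1, 2, 2]])

z_head = srproj_sketch(phi_pos, phi_zero, counts_head)
print(np.round(z_head, 8))
```

Up to rounding, the result agrees with the first three rows of Out[11], which suggests the positive-counts projection is scaled by the total count and the hurdle projection by the number of nonzero categories.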

Counts Inverse Regression (CIR)

Counts inverse regression allows us to predict a covariate with the counts and other covariates. Here we use HDMR for the backward regression and another model for the forward regression. This can be accomplished with a single command, by fitting a CIR{HDMR,FM} where the forward model is FM <: RegressionModel.

In [12]:
jl.eval("using GLM: LinearModel")
spec = jl.eval("CIR{HDMR,LinearModel}")
cir = hd.fit(spec,covars,counts,1, 
             select=Lasso.MinBIC(), nocounts=True)
cir
Out[12]:
<PyCall.jlwrap CIR{HDMR,LinearModel}(1, [1, 2], HDMRCoefs{InclusionRepetition,Array{Float64,2},Lasso.MinBIC,UnitRange{Int64}}([0.0 -2.18288 -1.1806 -0.415995; 0.0 0.330624 0.0 0.0; 0.0 0.0 0.0299796 0.0788679; 0.0 0.0 0.0 0.0], [0.0 0.0461661 1.36309 3.07912; 0.0 0.0 0.0 -0.670108; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], true, 100, 4, 1:3, 1:3, Lasso.MinBIC()), LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
───────────────────────────────────────────────────────────────────────
       Estimate  Std. Error    t value  Pr(>|t|)   Lower 95%  Upper 95%
───────────────────────────────────────────────────────────────────────
x1   0.887227     0.23313     3.80572     0.0003   0.424277   1.35018  
x2  -0.00275704   0.101541   -0.027152    0.9784  -0.204397   0.198883 
x3  -0.132434     0.101139   -1.30943     0.1936  -0.333275   0.0684076
x4   0.393855     0.697003    0.565069    0.5734  -0.990255   1.77797  
x5   0.236098     0.319183    0.739695    0.4613  -0.397736   0.869933 
x6  -0.00979175   0.0158525  -0.617678    0.5383  -0.0412717  0.0216882
x7  -0.0817407    0.0717278  -1.1396      0.2574  -0.224178   0.0606965
───────────────────────────────────────────────────────────────────────
, LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}}:

Coefficients:
──────────────────────────────────────────────────────────────────────
      Estimate  Std. Error     t value  Pr(>|t|)  Lower 95%  Upper 95%
──────────────────────────────────────────────────────────────────────
x1   0.57844     0.0794739   7.27837      <1e-10   0.420707  0.736174 
x2   0.0068076   0.1002      0.0679403    0.9460  -0.192061  0.205676 
x3  -0.13156     0.0990448  -1.32829      0.1872  -0.328136  0.0650164
──────────────────────────────────────────────────────────────────────
)>

Here, nocounts=True means we also fit a benchmark model without counts, and select=Lasso.MinBIC() selects the BIC-minimizing Lasso segment for each category.
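
As a rough illustration of what BIC-minimizing segment selection means, the sketch below scores each segment of a hypothetical regularization path with the textbook criterion BIC = -2 loglik + df log(n) and keeps the smallest. The function name min_bic_segment and the numbers are made up for illustration, and Lasso.jl may compute its criterion differently in detail.

```python
import numpy as np

def min_bic_segment(loglik, df, n):
    """Return the index of the path segment with the smallest BIC, and all BICs.

    loglik: log-likelihood of each segment along the regularization path
    df:     number of nonzero coefficients (degrees of freedom) per segment
    n:      number of observations
    """
    loglik = np.asarray(loglik, dtype=float)
    df = np.asarray(df, dtype=float)
    bic = -2.0 * loglik + df * np.log(n)   # textbook BIC; smaller is better
    return int(np.argmin(bic)), bic

# Hypothetical path: the fit improves as the penalty shrinks, but df grows.
loglik_path = [-120.0, -100.0, -95.0, -94.0]
df_path = [1, 2, 3, 6]

best, bic = min_bic_segment(loglik_path, df_path, n=100)
print(best, np.round(bic, 3))
```

The last segment fits best in likelihood terms, but its extra coefficients are penalized, so an intermediate segment is selected.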

We can get the backward and forward model coefficients with:

In [13]:
hd.coefbwd(cir)
Out[13]:
(array([[ 0.        , -2.18288411, -1.18060442, -0.41599536],
        [ 0.        ,  0.33062404,  0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.02997958,  0.07886791],
        [ 0.        ,  0.        ,  0.        ,  0.        ]]),
 array([[ 0.        ,  0.04616614,  1.36309252,  3.07912436],
        [ 0.        ,  0.        ,  0.        , -0.67010761],
        [ 0.        ,  0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.        ,  0.        ]]))
In [14]:
hd.coeffwd(cir)
Out[14]:
array([ 0.88722732, -0.00275704, -0.13243369,  0.39385493,  0.23609827,
       -0.00979175, -0.08174069])
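
The seven forward coefficients are consistent with a design matrix made of an intercept, the two covariates other than the target, and the four srproj columns, while the benchmark model without counts has three (intercept plus two covariates). The sketch below builds such a design matrix and fits it by ordinary least squares on synthetic data. The helper forward_design, the column order, and the random inputs are assumptions for illustration only, and projdir is 0-based here whereas the Julia call uses 1.

```python
import numpy as np

def forward_design(covars, z, projdir, nocounts=False):
    """Assumed forward-regression design: [1, other covariates, z]."""
    n = covars.shape[0]
    others = np.delete(covars, projdir, axis=1)   # drop the target covariate
    cols = [np.ones((n, 1)), others]
    if not nocounts:
        cols.append(z)                            # add the SR projection columns
    return np.hstack(cols)

rng = np.random.default_rng(0)
covars_demo = rng.uniform(size=(100, 3))   # stand-in for covars
z_demo = rng.normal(size=(100, 4))         # stand-in for the srproj output
y_demo = covars_demo[:, 0]                 # target: the first covariate

X_full = forward_design(covars_demo, z_demo, projdir=0)
X_bench = forward_design(covars_demo, z_demo, projdir=0, nocounts=True)

beta_full, *_ = np.linalg.lstsq(X_full, y_demo, rcond=None)
beta_bench, *_ = np.linalg.lstsq(X_bench, y_demo, rcond=None)

rss_full = np.sum((y_demo - X_full @ beta_full) ** 2)
rss_bench = np.sum((y_demo - X_bench @ beta_bench) ** 2)
print(beta_full.shape, beta_bench.shape)
```

Because the benchmark design is nested in the full one, the in-sample residual sum of squares of the full model can never exceed that of the benchmark; whether the counts help out of sample is a separate question.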

The fitted model can be used to predict vy with new data:

In [15]:
# Python indexing is 0-based: range(1,10) selects rows 1 through 9 (nine observations)
hd.predict(cir, covars[range(1,10),:], counts[range(1,10),:])
Out[15]:
array([0.55374217, 0.44339513, 0.41348183, 0.44716221, 0.42806591,
       0.46489547, 0.48025307, 0.45795602, 0.56790706])

We can also predict with only the other covariates, which in this case is just a linear regression:

In [16]:
hd.predict(cir, covars[range(1,10),:], counts[range(1,10),:], nocounts=True)
Out[16]:
array([0.55869446, 0.46528463, 0.48635113, 0.4689304 , 0.49773856,
       0.51824815, 0.45533037, 0.53886851, 0.55713808])

Kelly, Manela, and Moreira (2018) show that the differences between DMR and HDMR can be substantial in some cases, especially when the counts data is highly sparse.

Please refer to the paper for additional details and example applications.