Convolutional neural networks have proven to be highly effective parametric learners of complex functions. However, the non-linearities in conventional networks are weak: both halves of a (possibly leaky) ReLU are linear, and the non-linearity is computed independently for each channel. We present techniques that create decision-tree and RBF units designed to respond non-linearly to complex joint distributions across channels. This makes it possible to pack more non-linearity into a small space, which is a particularly valuable replacement for the later layers of a network - in particular the solver. The result is hybrid networks that outperform conventional, purely neural networks and can be trained orders of magnitude more quickly.
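To make the contrast concrete, the following is a minimal sketch (not the paper's implementation) of why an RBF unit is a "stronger" non-linearity than a ReLU: the ReLU acts on each channel independently, while the RBF unit's response depends on the joint configuration of all channels relative to a learned center. The function names and the Gaussian form with a single `gamma` width parameter are illustrative assumptions.

```python
import numpy as np

def relu(x):
    # Per-channel, piecewise-linear: each output depends on one input channel.
    return np.maximum(x, 0.0)

def rbf_units(x, centers, gamma=1.0):
    """Gaussian RBF units over the joint channel vector.

    x:       (batch, channels) input activations
    centers: (units, channels) learned centers, one per unit
    Returns: (batch, units) responses; each response depends on ALL channels.
    """
    # Squared Euclidean distance from every input vector to every center.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

x = np.array([[1.0, 2.0]])                      # one input, two channels
centers = np.array([[1.0, 2.0], [0.0, 0.0]])    # two RBF units
resp = rbf_units(x, centers)
# The unit whose center matches the joint input responds maximally (1.0);
# the other unit's response decays with distance in the full channel space.
```

Perturbing a single channel changes every RBF response, whereas a ReLU output changes only on its own channel; this coupling is what lets such units model joint distributions across channels.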