Datasets

Google Colab

[1]:
# For installation of the necessary packages in Google Colab
try:
    import predicode as pc
except ImportError:
    # Tensorflow 2.0 must be installed manually in Google Colab
    !pip install tensorflow==2.0.0rc
    !pip install git+https://github.com/sflippl/predicode
    import predicode as pc
# lazytools just contains a few convenience functions, specifically matrix heatmaps,
# but is otherwise not necessary.
try:
    import lazytools_sflippl as lazytools
except ImportError:
    !pip install git+https://github.com/sflippl/lazytools
    import lazytools_sflippl as lazytools

Artificial Datasets

Artificial datasets provide a simple example of how the algorithm works and an opportunity to study its analytical solutions.

Decaying Multinormal Distribution

The closed-form solution of a linear predictive coding model is given by a principal components analysis. A multinormal distribution provides a simple model for studying such a solution. The function ‘decaying_multi_normal’ generates a high-dimensional input whose principal components decay in importance: the variance of successive principal components is controlled by the decay constant ‘alpha’. The dimensionality of the input data is specified using ‘dimensions’ and the sample size using ‘size’.

[3]:
import numpy as np

# Sample 10,000 observations of a 10-dimensional decaying multinormal distribution
art_data = pc.datasets.decaying_multi_normal(
    dimensions=10, size=10000, alpha=1)
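
To make the idea of "decaying importance" concrete, here is a minimal NumPy sketch of how such data could be sampled. It is not predicode's implementation: the function name `sample_decaying_multinormal`, the `seed` parameter, and in particular the assumed decay law (the k-th principal component has variance exp(-alpha * k)) are illustrative assumptions.

```python
import numpy as np


def sample_decaying_multinormal(dimensions, size, alpha, seed=0):
    """Hypothetical stand-in for a decaying multinormal sampler."""
    rng = np.random.default_rng(seed)
    # Assumption: the k-th principal component has variance exp(-alpha * k).
    variances = np.exp(-alpha * np.arange(dimensions))
    # A random orthonormal basis, taken from the QR decomposition
    # of a Gaussian matrix, defines the principal axes.
    basis, _ = np.linalg.qr(rng.standard_normal((dimensions, dimensions)))
    # Draw independent components, scale each by its standard deviation,
    # then rotate them into the random basis.
    components = rng.standard_normal((size, dimensions)) * np.sqrt(variances)
    return components @ basis.T, variances


samples, variances = sample_decaying_multinormal(
    dimensions=10, size=10000, alpha=1)
print(samples.shape)
```

Under these assumptions the covariance of `samples` has eigenvalues close to `variances`, so larger values of `alpha` concentrate the variance in fewer components.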
[4]:
lazytools.matrix_heatmap(art_data, pole=0)
../_images/usage_datasets_8_0.png
[8]:
lazytools.matrix_heatmap(np.cov(art_data.T), pole=0)
../_images/usage_datasets_9_0.png
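
Since the closed-form solution of the linear model is a principal components analysis, it can help to inspect the eigendecomposition of the covariance matrix directly. The following sketch uses plain NumPy on stand-in data; the stand-in (independent features with variance exp(-k)) is an assumption made for illustration and is not generated by predicode.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 10 independent features whose variances
# decay exponentially, 10,000 samples.
true_variances = np.exp(-np.arange(10.0))
data = rng.standard_normal((10000, 10)) * np.sqrt(true_variances)

# PCA via the eigendecomposition of the sample covariance matrix.
covariance = np.cov(data.T)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# eigh returns eigenvalues in ascending order; sort them descending so
# that the first column of eigenvectors is the first principal component.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variance explained by each component.
explained = eigenvalues / eigenvalues.sum()
print(np.round(explained, 3))
```

The same steps should apply to `art_data` from the cells above by replacing `data` with `art_data`, assuming it is a samples-by-dimensions array as the call to `np.cov(art_data.T)` suggests.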

Image Datasets

Image datasets are included in the ‘datasets’ module mainly as examples for the predictive coding algorithms. Although their main purpose is to serve as input to those algorithms, ‘predicode’ also provides some functionality for exploring the datasets on their own. In particular, a selection of images can be visualized using the ‘pictures’ method (see Cifar-10 below).

Cifar-10

Cifar-10 serves as a simple example dataset for basic predictive coding algorithms demonstrating static extraclassical effects.

For now, only the training dataset can be loaded, using the class ‘Cifar10’.

[5]:
import predicode as pc
cifar = pc.datasets.Cifar10()

This dataset can be explored by looking at the pictures along with their labels. The pictures can be shown either in black and white (mode ‘bw’) or in color (mode ‘color’).

[6]:
cifar.pictures(subset=range(25), mode='bw')
../_images/usage_datasets_17_0.png
[7]:
cifar.pictures(subset=range(25), mode='color')
../_images/usage_datasets_18_0.png

Both views build on an underlying data frame that contains, for each pixel, the RGB values used for the color picture and the corresponding values used for the black-and-white picture:

[8]:
cifar.rgb_dataframe(subset=range(1)).head()
[8]:
   image_id  x  y   r   g   b         bw      rgb   rgb_bw
0         0  0  0  59  62  63  61.333333  #3b3e3f  #3d3d3d
1         0  1  0  43  46  45  44.666667  #2b2e2d  #2c2c2c
2         0  2  0  50  48  43  47.000000  #32302b  #2f2f2f
3         0  3  0  68  54  42  54.666667  #44362a  #363636
4         0  4  0  98  73  52  74.333333  #624934  #4a4a4a
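
The derived columns in this table appear to follow directly from ‘r’, ‘g’ and ‘b’: ‘bw’ matches the mean of the three channels, ‘rgb’ is their hexadecimal color code, and ‘rgb_bw’ is the code of the truncated ‘bw’ value repeated three times. The pandas sketch below reproduces the five rows shown under that reading. It is inferred from the printed values, not taken from predicode's source, and the helper `to_hex` is hypothetical.

```python
import pandas as pd

# First five pixels of image 0, copied from the table above.
pixels = pd.DataFrame({
    "image_id": [0, 0, 0, 0, 0],
    "x": [0, 1, 2, 3, 4],
    "y": [0, 0, 0, 0, 0],
    "r": [59, 43, 50, 68, 98],
    "g": [62, 46, 48, 54, 73],
    "b": [63, 45, 43, 42, 52],
})


def to_hex(r, g, b):
    """Hypothetical helper: format three channel values as '#rrggbb'.

    Fractional values are truncated toward zero, which is what the
    printed 'rgb_bw' column suggests.
    """
    return "#{:02x}{:02x}{:02x}".format(int(r), int(g), int(b))


# Black-and-white intensity as the unweighted mean of the three channels.
pixels["bw"] = pixels[["r", "g", "b"]].mean(axis=1)
pixels["rgb"] = [
    to_hex(r, g, b)
    for r, g, b in zip(pixels["r"], pixels["g"], pixels["b"])
]
pixels["rgb_bw"] = [to_hex(v, v, v) for v in pixels["bw"]]

print(pixels)
```

Note that an unweighted mean differs from the luminance-weighted grayscale conversions used by many image libraries, so black-and-white pictures produced elsewhere may look slightly different.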