SOM -- Self-organizing maps in Go

Go implementation of Self-Organizing Maps (SOM), also known as Kohonen maps. It provides a command line tool and a library for training and visualizing SOMs.

Features

Multi-layered SOMs, alias XYF, alias super-SOMs.
Visualization Induced SOMs, alias ViSOMs.
Training from CSV files without any manual preprocessing.
Supports continuous and discrete data.
Fully customizable training and SOM parameters.
Visualization of SOMs by a wide range of flexible plots.
Use as command line tool or as Go library.

Please note that the built-in visualizations are not intended for publication-quality output. Instead, they serve as quick tools for inspecting training and prediction results. For high-quality visualizations, we recommend exporting the SOM and other results to CSV files. You can then use dedicated visualization libraries in languages such as Python or R to create more refined and customized graphics.

Installation

Pre-compiled binaries for Linux, Windows and macOS are available on the Releases page.

Alternatively, install the latest version using Go:

go install github.com/mlange-42/som/cmd/som@latest

Usage

Get help for the command line tool:

som --help

Here are some examples of how to use the command line tool, using the World Countries dataset.

Train an SOM with the dataset:

som train _examples/countries/untrained.yml _examples/countries/data.csv > trained.yml

Visualize the trained SOM as heatmaps of components, showing labels of data points (i.e. countries):

som plot heatmap trained.yml heatmap.png --data-file _examples/countries/data.csv --label Country

Export the trained SOM to a CSV file:

som export trained.yml > nodes.csv

Determine the best-matching unit (BMU) for each row in the dataset:

som bmu trained.yml _examples/countries/data.csv --preserve Country,code,continent > bmu.csv
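
For readers new to the term, the following is a minimal sketch of what finding a best-matching unit means: the BMU is the node whose weight vector is closest to a data row. The function name findBMU and the flat slice of nodes are illustrative assumptions, not the library's API. The sketch uses plain squared Euclidean distance on a single layer and ignores layer weights, normalization, and the other metrics that the tool supports.

```go
package main

import (
	"fmt"
	"math"
)

// findBMU returns the index of the node whose weight vector is closest
// to the sample, measured by squared Euclidean distance.
// It assumes every node has at least as many components as the sample,
// and returns -1 if nodes is empty.
// Hypothetical helper for illustration; not part of the som library.
func findBMU(nodes [][]float64, sample []float64) int {
	best, bestDist := -1, math.Inf(1)
	for i, node := range nodes {
		dist := 0.0
		for j, v := range sample {
			d := node[j] - v
			dist += d * d
		}
		if dist < bestDist {
			best, bestDist = i, dist
		}
	}
	return best
}

func main() {
	// Three made-up nodes with two components each.
	nodes := [][]float64{{0, 0}, {1, 1}, {2, 0}}
	fmt.Println(findBMU(nodes, []float64{0.9, 1.2}))
}
```

Comparing squared distances avoids a square root per node and does not change which node is closest.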

Available commands

Taken from the CLI help, here is a tree representation of all currently available (sub)-commands:

som          Self-organizing maps command line tool.
├─train      Trains an SOM on the given dataset.
├─quality    Calculates various quality metrics for a trained SOM.
├─label      Classifies SOM nodes using label propagation.
├─export     Exports an SOM to a CSV table of node vectors.
├─predict    Predicts entire layers or table columns using a trained SOM.
├─bmu        Finds the best-matching unit (BMU) for each table row in a dataset.
├─fill       Fills missing data in the data file based on a trained SOM.
└─plot       Plots visualizations for an SOM in various ways. See sub-commands.
  ├─heatmap  Plots heat maps of multiple SOM variables, a.k.a. components plot.
  ├─codes    Plots SOM node codes in different ways. See sub-commands.
  │ ├─line   Plots SOM node codes as line charts.
  │ ├─bar    Plots SOM node codes as bar charts.
  │ ├─pie    Plots SOM node codes as pie charts.
  │ ├─rose   Plots SOM node codes as rose alias Nightingale charts.
  │ └─image  Plots SOM node codes as images.
  ├─u-matrix Plots the u-matrix of an SOM, showing inter-node distances.
  ├─xy       Plots for pairs of SOM variables as scatter plots.
  ├─density  Plots the data density of an SOM as a heatmap.
  └─error    Plots (root) mean-squared node error as a heatmap.

YAML configuration

The command line tool uses a YAML configuration file to specify the SOM parameters.

Here is an example of a configuration file for the Iris dataset. The dataset has these columns: species, sepal_length, sepal_width, petal_length, and petal_width.

som:                      # SOM definitions
  size: [8, 6]            # Size of the SOM
  neighborhood: gaussian  # Neighborhood function
  metric: manhattan       # Distance metric in map space
  visom-metric: euclidean # Distance metric for ViSOM update

  layers:                 # Layers of the SOM
    - name: Scalars       # Name of the layer. Has no meaning for continuous layers
      columns:            # Columns of the layer
        - sepal_length    # Column names as in the dataset
        - sepal_width
        - petal_length
        - petal_width
      norm: [gaussian]    # Normalization function(s) for columns
      metric: euclidean   # Distance metric
      weight: 1           # Weight of the layer

    - name: species       # Name of the layer. Use column name for categorical layers
      metric: hamming     # Distance metric
      categorical: true   # Layer is categorical. Omit columns
      weight: 0.5         # Weight of the layer

training:                 # Training parameters. Optional. Can be overridden by CLI arguments
  epochs: 2500                        # Number of training epochs
  alpha: polynomial 0.25 0.01 2       # Learning rate decay function
  radius: polynomial 6 1 2            # Neighborhood radius decay function
  weight-decay: polynomial 0.5 0.0 3  # Weight decay coefficient function
  lambda: 0.33                        # ViSOM resolution parameter

See the examples folder for more examples.

License

This project is distributed under the MIT license.
