Cognition and State Modeling

sentence-transformers

Dense semantic embeddings for retrieval, similarity, clustering, and reranking.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Convert text into dense vectors for semantic search, matching, and clustering.

Input: Text / sentence list

Output: Vector embeddings / similarity scores

Trigger Timing: Triggered on demand after the required input files and configuration are prepared.

Runtime: Python / PyTorch

Before: Text / sentence list

Prepare the text file or sentence list and the configuration expected by the original project.

After: Vector embeddings / similarity scores

Read the produced embedding vectors and, if requested, the similarity scores.

Preset Example

A quick-run style example for the documentation page.

Input: tools/sentence-transformers/examples/sentences.txt

Prompt: model: all-MiniLM-L6-v2

Expected: Embedding matrix and optional similarity scores.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

model (text): all-MiniLM-L6-v2

Sentence-transformers model identifier.

input (path)

Path to text lines or JSON records.
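The page does not specify the record layout, so the following is only a minimal sketch of how such an input path might be read. It assumes that a `.jsonl` suffix means one JSON object per line and that the text sits under a `text` key; both the helper name `load_texts` and the `text_key` parameter are hypothetical.

```python
import json
from pathlib import Path


def load_texts(path, text_key="text"):
    """Return a list of strings from a plain-text or JSON Lines file.

    Assumptions (not confirmed by the tool): a ".jsonl" suffix marks one
    JSON object per line, and the sentence is stored under `text_key`.
    Any other suffix is treated as one sentence per line.
    """
    path = Path(path)
    texts = []
    with path.open(encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines in either format
            if path.suffix == ".jsonl":
                record = json.loads(line)
                texts.append(str(record[text_key]))
            else:
                texts.append(line)
    return texts
```

Calling `load_texts("tools/sentence-transformers/examples/sentences.txt")` would then return the list of sentences to encode.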

normalize (toggle): true

Whether to L2-normalize vectors for cosine search.
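Why normalization matters for cosine search can be shown with a small pure-Python sketch: once two vectors are scaled to unit length, their plain dot product equals their cosine similarity, so a retrieval index only needs dot products. The helper names and the toy vectors here are illustrative and not part of the tool.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vector):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(dot(vector, vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
# Both values agree up to floating-point rounding.
print(cosine(a, b))
print(dot(l2_normalize(a), l2_normalize(b)))
```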

Output Explanation

embeddings

Dense vectors for each input text item.

similarity_matrix

Optional pairwise semantic similarity output.
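One way to consume a pairwise similarity matrix is to look up the closest neighbor of a given item. The sketch below assumes the matrix is a square list of rows in the same order as the input texts; the function name `most_similar` and the toy scores are made up for illustration.

```python
def most_similar(similarity_matrix, index):
    """Return (other_index, score) for the best match to `index`.

    The diagonal entry (an item compared with itself) is skipped.
    """
    row = similarity_matrix[index]
    candidates = [(j, score) for j, score in enumerate(row) if j != index]
    return max(candidates, key=lambda pair: pair[1])


# Toy 3x3 matrix with invented scores, symmetric with ones on the diagonal.
toy_matrix = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.4],
    [0.9, 0.4, 1.0],
]
best_index, best_score = most_similar(toy_matrix, 0)
```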

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Resources

GitHub: https://github.com/UKPLab/sentence-transformers
Code Download: https://github.com/UKPLab/sentence-transformers/archive/refs/heads/master.zip
arXiv: https://arxiv.org/abs/1908.10084
Model Catalog: https://www.sbert.net/docs/sentence_transformer/pretrained_models.html

Deployment Notes

Install sentence-transformers and a compatible torch version.
Download the model from Hugging Face on first run, or pre-cache it for offline use.
Run the encoding wrapper with repository-relative input paths.
Store the vectors in tools/sentence-transformers/runs/ for retrieval tooling.

Relative Path Example

python tools/sentence-transformers/run.py --model all-MiniLM-L6-v2 --input tools/sentence-transformers/examples/sentences.txt --out tools/sentence-transformers/runs/embeddings.npy
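The contents of `run.py` are not shown on this page, so the following is only a hedged sketch of what such a wrapper could look like for the three flags used in the command above. `SentenceTransformer(...).encode(..., normalize_embeddings=...)` and `numpy.save` are real library calls; the file layout, the function names, and the one-sentence-per-line input handling are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the three flags that appear in the example command."""
    parser = argparse.ArgumentParser(description="Encode sentences into embeddings.")
    parser.add_argument("--model", default="all-MiniLM-L6-v2")
    parser.add_argument("--input", required=True)
    parser.add_argument("--out", required=True)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)

    # Imported lazily so argument parsing works without the heavy dependencies.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    # Assumption: the input file holds one sentence per line.
    with open(args.input, encoding="utf-8") as handle:
        sentences = [line.strip() for line in handle if line.strip()]

    # Downloads the model on first use unless it is already cached.
    model = SentenceTransformer(args.model)
    # normalize_embeddings=True mirrors the documented `normalize` toggle.
    embeddings = model.encode(sentences, normalize_embeddings=True)
    np.save(args.out, embeddings)  # array of shape (num_sentences, dim)
```

A script built this way would call `main()` from its entry point, and the saved file can be read back with `numpy.load`.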

Expected Result Shape

{
  "tool": "sentence-transformers",
  "status": "ok",
  "results": [
    {
      "label": "Sentence embedding",
      "score": 0.87,
      "output": "Vector embeddings / similarity scores"
    }
  ],
  "timing": {
    "runtime": "The official SBERT table reports 14,200 sentences/sec on V100, with an 80 MB model, 384-dimensional embeddings, and max sequence length 256.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/sentence-transformers/runs/visualization.png",
    "raw_predictions": "tools/sentence-transformers/runs/predictions.json"
  }
}
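A consumer of this result might check the top-level keys before reading the scores. The sketch below treats the five keys in the sample as required, which is an assumption drawn from that sample rather than a documented schema; `summarize_result` is a hypothetical helper name.

```python
import json

# Assumption: these keys are always present, as in the sample result.
REQUIRED_KEYS = ("tool", "status", "results", "timing", "artifacts")


def summarize_result(payload):
    """Validate a result JSON string and return (label, score) pairs."""
    data = json.loads(payload)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["status"] != "ok":
        raise RuntimeError(f"tool reported status {data['status']!r}")
    return [(item["label"], item["score"]) for item in data["results"]]
```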


Academic Info

Paper identity and contribution summary.

Title: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Authors: Nils Reimers, Iryna Gurevych

Venue: EMNLP-IJCNLP 2019 / arXiv:1908.10084

Contribution: Enables efficient semantic similarity and retrieval with reusable sentence embeddings.

Citation

@misc{sentencetransformers2019,
  title={Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks},
  author={Nils Reimers and Iryna Gurevych},
  year={2019},
  note={EMNLP-IJCNLP 2019 / arXiv:1908.10084},
  url={https://arxiv.org/abs/1908.10084}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset	Metric	Value	Runtime	Source
SBERT model table, 14 sentence-embedding datasets	all-MiniLM-L6-v2 sentence performance	68.06	14,200 sentences/sec on V100	Official SBERT pretrained model table
SBERT model table, 6 semantic-search datasets	all-MiniLM-L6-v2 semantic-search performance	49.54	80 MB model, 384 dimensions, max sequence length 256	Official SBERT pretrained model table
SBERT model table	Training scale	1B+ training pairs	Mean pooling, normalized embeddings	Official SBERT pretrained model table
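The throughput and dimension figures in the table allow a rough capacity estimate. The sketch below assumes embeddings are stored as 32-bit floats (4 bytes per value) and that the reported V100 throughput holds for the whole corpus; neither is guaranteed on other hardware or storage formats.

```python
SENTENCES_PER_SECOND = 14_200  # V100 throughput reported in the table
EMBEDDING_DIM = 384            # embedding dimensions reported in the table
BYTES_PER_VALUE = 4            # assumption: float32 storage


def estimate(num_sentences):
    """Return (encoding_seconds, raw_storage_bytes) for a corpus size."""
    seconds = num_sentences / SENTENCES_PER_SECOND
    size_bytes = num_sentences * EMBEDDING_DIM * BYTES_PER_VALUE
    return seconds, size_bytes


seconds, size_bytes = estimate(1_000_000)
print(f"{seconds:.1f} s, {size_bytes / 1e9:.3f} GB")
```

Each vector takes 384 × 4 = 1,536 bytes under these assumptions, so storage grows linearly with corpus size.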

Artifacts

Official SBERT pretrained model table, model card link, pretrained models, evaluation scripts, and embedding outputs.

Demo Images

Visual references from the original tool.