Crosstem Documentation

A comprehensive Python package for morphological analysis combining derivational stemming, inflectional analysis, and cross-lingual etymology.

Requires Python 3.7+. License: MIT.

Why Crosstem?

Crosstem finds true linguistic roots across part-of-speech boundaries, for example reducing the noun "organization" to the verb "organize". Traditional stemmers and lemmatizers cannot do this.

Quick Start

Installation:

pip install crosstem

Basic Usage:

from crosstem import DerivationalStemmer

stemmer = DerivationalStemmer('eng')
print(stemmer.stem('organization'))  # Output: organize
print(stemmer.stem('beautiful'))     # Output: beauty
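
The same call can be applied to a list of tokens. The sketch below is illustrative only: `stem_tokens` is a hypothetical helper, not part of Crosstem, and it assumes only that the stemmer object has a `stem(word)` method returning a string, as in the example above. The import is guarded so the snippet still runs when the package is not installed.

```python
from typing import Iterable, List


def stem_tokens(stemmer, tokens: Iterable[str]) -> List[str]:
    """Apply stemmer.stem() to each token, preserving order.

    Hypothetical helper: it relies only on the stem(word) method
    shown in Basic Usage, nothing else from the library.
    """
    return [stemmer.stem(token) for token in tokens]


if __name__ == "__main__":
    try:
        from crosstem import DerivationalStemmer
    except ImportError:
        # The package is optional for this sketch.
        print("crosstem is not installed; run: pip install crosstem")
    else:
        stemmer = DerivationalStemmer('eng')
        print(stem_tokens(stemmer, ['organization', 'beautiful']))
```

Because the helper depends only on the `stem` method, any object exposing that method can be passed in, which makes it easy to swap in a stub during testing.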

Key Features

  • Cross-POS derivational stemming: Only library that finds roots across parts of speech

  • Linguistic accuracy: Uses MorphyNet morphological data, not brittle rules

  • Rust-accelerated derivational engine: PyO3 backend with automatic Python fallback

  • 10×+ faster than Porter: Fast graph traversal with hash lookups

  • Etymology tracing: 4.2M relationships across 2,265 languages

  • Word families: Discover complete derivational networks

  • Hybrid runtime: Rust acceleration when available, pure-Python compatibility fallback

  • 15 languages: Multilingual morphology support
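
To make "graph traversal with hash lookups" concrete, here is a toy sketch of the general idea. It is not Crosstem's implementation: the `DERIVED_FROM` table, `toy_stem`, and `toy_family` are hypothetical names, and the four entries are hand-written for illustration rather than taken from MorphyNet.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each derived word maps to the word it derives from.
# These entries are hand-written for illustration, not MorphyNet data.
DERIVED_FROM: Dict[str, str] = {
    "organization": "organize",
    "organizational": "organization",
    "beautiful": "beauty",
    "beautifully": "beautiful",
}


def toy_stem(word: str) -> str:
    """Follow derived-from edges until reaching a word with no parent."""
    seen: Set[str] = set()  # guards against cycles in the data
    current = word.lower()
    while current in DERIVED_FROM and current not in seen:
        seen.add(current)
        current = DERIVED_FROM[current]  # one dictionary (hash) lookup per step
    return current


def toy_family(root: str) -> List[str]:
    """Collect every known word whose chain of edges leads to root."""
    return sorted(w for w in DERIVED_FROM if toy_stem(w) == root)


print(toy_stem("organizational"))  # organize
print(toy_stem("beautifully"))     # beauty
print(toy_family("beauty"))        # ['beautiful', 'beautifully']
```

Each step is a single dictionary lookup, so the cost of stemming a word grows with the length of its derivation chain rather than with the size of the vocabulary. Words missing from the table are returned unchanged in this sketch.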
