\name{sumer-package}
\alias{sumer-package}
\alias{sumer}
\docType{package}
\title{Tools for Working with Sumerian Cuneiform Texts}
\description{
The \pkg{sumer} package provides tools for translating and analyzing transliterated Sumerian
cuneiform texts. It converts between transliterations, canonical sign names,
and cuneiform Unicode characters, includes a dictionary lookup system for
translation work, and offers statistical tools for analyzing the grammatical
structure of signs in context.
}
\section{Getting Started}{
Load the package, load a dictionary, and look up your first word:

\preformatted{
library(sumer)

# Load the built-in dictionary
dic <- read_dictionary()

# Look up a Sumerian word
look_up("lugal", dic)

# Search for an English term
look_up("king", dic, "en")
}
}
\section{Cuneiform Conversion}{
Sumerian text can be entered as transliteration (e.g. \code{"lugal"}),
as sign names (e.g. \code{"LUGAL"}), or as cuneiform Unicode characters.
The following functions convert between these representations:

\describe{
\item{\code{\link{as.cuneiform}}}{
Converts transliteration or sign names to cuneiform Unicode characters.
\preformatted{
as.cuneiform("lugal")
as.cuneiform("d-en-lil2")
}
}
\item{\code{\link{as.sign_name}}}{
Converts transliteration to canonical sign names.
\preformatted{
as.sign_name("lugal")
as.sign_name("d-en-lil2")
}
}
\item{\code{\link{info}}}{
Shows all available information about a sign or compound: reading,
sign name, cuneiform character, and alternative readings.
\preformatted{
info("lugal")
info("jic-tukul")
}
}
}
}
\section{Dictionary Lookup}{
The core workflow for translation: load a dictionary and look up words.

\describe{
\item{\code{\link{look_up}}}{
Looks up a Sumerian expression in a dictionary. Forward lookup
(Sumerian to translation) shows the cuneiform form, sign names,
translations with grammatical types, and entries for individual
signs and substrings. Reverse lookup searches for a term in the
translations.
\preformatted{
dic <- read_dictionary()

# Forward: Sumerian -> translation
look_up("d-suen", dic)

# Reverse: translation -> Sumerian
look_up("water", dic, "en")
look_up("Gilgamesh", dic, "en")
}
}
\item{\code{\link{skeleton}}}{
Generates a hierarchical translation template for a Sumerian
sentence. Each word is broken down into syllables and individual
signs, ready to be annotated with translations.
\preformatted{
skeleton("a-ma-ru ba-ur3 ra")
}
}
}
}
\section{Text Analysis}{
These functions help you analyze the statistical and grammatical
structure of a Sumerian text.

\subsection{N-gram Analysis}{
\code{\link{ngram_frequencies}} finds recurring sign combinations in a
text.

\preformatted{
# Use "Enki and the World Order" as an example text
path <- system.file("extdata", "enki_and_the_world_order.txt", package = "sumer")
text <- readLines(path, encoding="UTF-8")
freq <- ngram_frequencies(text, min_freq = 6)
head(freq)
}
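
As a rough intuition for what counting sign combinations involves, the
following base R sketch counts bigrams in a made-up sign sequence and keeps
those above a frequency threshold. It is an illustration only and does not
reproduce the internals of \code{ngram_frequencies}; the input line, the
space-separated tokenization, and the threshold of 2 are assumptions.

\preformatted{
# Hypothetical line of signs, separated by single spaces
signs <- strsplit("AN KI AN KI LUGAL AN KI", " ", fixed = TRUE)[[1]]

# Pair each sign with its successor to form bigrams
bigrams <- paste(head(signs, -1), tail(signs, -1))

# Count the bigrams and keep those that occur at least twice
counts <- table(bigrams)
counts[counts >= 2]
}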

\code{\link{mark_ngrams}} encloses each of these sign combinations in curly
braces wherever it occurs in the text:

\preformatted{
text_marked <- mark_ngrams(text, freq)
cat(text_marked[1:10], sep="\n")
}

Find all occurrences of a pattern in the annotated text:

\preformatted{
term     <- "IGI.DIB.TU"
(pattern <- mark_ngrams(term, freq))
result   <- text_marked[grepl(pattern, text_marked, fixed=TRUE)]
cat(result, sep="\n")
}
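
The search above passes \code{fixed=TRUE} to \code{grepl}, which matters
because sign names contain dots and marked text contains braces, both of which
are special characters in regular expressions. The base R sketch below shows
the difference on two made-up lines; the exact form of real
\code{mark_ngrams} output is an assumption here.

\preformatted{
# Hypothetical marked lines; real mark_ngrams() output may look different
lines   <- c("AN {IGI.DIB}.TU KI", "AN IGI-DIB KI")
pattern <- "{IGI.DIB}"

# Literal matching: dots and braces are treated as ordinary characters
grepl(pattern, lines, fixed = TRUE)   # TRUE FALSE

# Regex matching of the unbraced term: "." matches any character,
# so the second line matches as well
grepl("IGI.DIB", lines)               # TRUE TRUE
}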

}
\subsection{Grammatical Analysis}{
Each sign in a dictionary can have one or more grammatical types
(e.g. S for noun, V for verb, A for attribute, or operators like
Sx->V). The following functions analyze how signs are used
grammatically.

\code{\link{sign_grammar}} counts how often each grammatical type
occurs for each sign in a string, based on dictionary entries:

\preformatted{
dic <- read_dictionary()
sg  <- sign_grammar("a-ma-ru ba-ur3 ra", dic)
sg
}

To obtain a Bayesian estimate of the distribution of grammatical types, use
\code{\link{prior_probs}} and \code{\link{grammar_probs}}:

\preformatted{
prior <- prior_probs(dic, sentence_prob = 0.25)
gp    <- grammar_probs(sg, prior, dic)
}
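
For readers unfamiliar with the idea, the base R sketch below shows one
generic way to combine prior probabilities with observed counts, namely the
posterior mean of a Dirichlet-multinomial model. It is not taken from the
package, and \code{grammar_probs} may work differently; the counts, the prior,
and the \code{strength} parameter are all made-up values for illustration.

\preformatted{
counts   <- c(S = 3, V = 1, A = 0)        # hypothetical counts for one sign
prior    <- c(S = 0.5, V = 0.3, A = 0.2)  # hypothetical prior probabilities
strength <- 2                             # hypothetical weight of the prior

# Posterior mean under a Dirichlet prior with parameters strength * prior
posterior <- (counts + strength * prior) / (sum(counts) + strength)
round(posterior, 3)                       # S 0.667, V 0.267, A 0.067
}

Even the type with zero observed counts keeps a small nonzero probability,
which is the practical benefit of a prior over raw relative frequencies.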

\code{\link{plot_sign_grammar}} visualizes the result as a stacked
bar chart showing the grammatical type distribution for each sign in a sequence.
It accepts output from either \code{sign_grammar} (raw counts) or
\code{grammar_probs} (probabilities):

\preformatted{
plot_sign_grammar(gp)
plot_sign_grammar(gp, output_file = "grammar.png")
}
}
}
\section{Creating Your Own Dictionary}{
You can build a dictionary from annotated translation files. These
files use a pipe format where each line starts with \code{|} and
contains the sign name, grammatical type, and meaning separated by
colons (e.g. \code{|lugal:S:king}).
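
To make the pipe format concrete, the following base R sketch splits a few
made-up annotation lines into their three fields. It only illustrates the
format described above and is not the parser used by the package; the input
lines and the column names \code{sign_name}, \code{type}, and \code{meaning}
are assumptions.

\preformatted{
# Hypothetical lines from a translation file
lines <- c("|lugal:S:king", "not an annotation line", "|e2:S:house")

# Keep only lines that start with "|", then drop the leading pipe
entries <- lines[startsWith(lines, "|")]
entries <- substring(entries, 2)

# Split each entry into its three colon-separated fields
fields <- do.call(rbind, strsplit(entries, ":", fixed = TRUE))
parsed <- data.frame(sign_name = fields[, 1],
                     type      = fields[, 2],
                     meaning   = fields[, 3])
parsed
}

This simple split assumes that meanings contain no colons of their own.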

\describe{
\item{\code{\link{make_dictionary}}}{
Reads a translation file and converts it to dictionary format in
one step:
\preformatted{
filename   <- system.file("extdata", "text_with_translations.txt",
                          package = "sumer")
dictionary <- make_dictionary(filename)
}

This is equivalent to calling \code{\link{read_translated_text}}
followed by \code{\link{convert_to_dictionary}}.
}
\item{\code{\link{save_dictionary}} / \code{\link{read_dictionary}}}{
Save a dictionary to a file and load it again later:
\preformatted{
save_dictionary(dictionary, "my_dictionary.txt",
                author  = "My Name",
                year    = "2025",
                version = "1.0")

my_dic <- read_dictionary("my_dictionary.txt")
look_up("ki", my_dic)
}
}
}
}
\author{
\strong{Maintainer}: Robin Wellmann \email{ro.wellmann@gmail.com}
}
\seealso{
Conversion:
\code{\link{as.cuneiform}},
\code{\link{as.sign_name}},
\code{\link{info}},
\code{\link{split_sumerian}}

Dictionary lookup:
\code{\link{read_dictionary}},
\code{\link{look_up}},
\code{\link{skeleton}}

Text analysis:
\code{\link{ngram_frequencies}},
\code{\link{mark_ngrams}},
\code{\link{sign_grammar}},
\code{\link{prior_probs}},
\code{\link{grammar_probs}},
\code{\link{plot_sign_grammar}}

Dictionary creation:
\code{\link{read_translated_text}},
\code{\link{convert_to_dictionary}},
\code{\link{make_dictionary}},
\code{\link{save_dictionary}}
}
