Glossary

**Tabular Data**: Data that is organized in a table format with rows and columns, where each row represents an observation and each column represents a feature or variable, often used in traditional machine learning tasks.

**t-Distributed Stochastic Neighbor Embedding (t-SNE)**: A dimensionality reduction technique that visualizes high-dimensional data by embedding it in a low-dimensional space, typically two or three dimensions, while preserving local neighborhood structure, often used for data exploration and visualization.
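
A minimal sketch, assuming scikit-learn is available, embedding the 64-dimensional digits dataset into two dimensions for plotting:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)  # 1,797 samples, 64 features each
# Embed into 2-D; perplexity and learning rate are left at their defaults.
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
# X_2d has shape (1797, 2) and is suitable for a scatter plot.
```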

**Target Variable**: The variable or outcome that a machine learning model is trained to predict, also known as the dependent variable or label.

**Temporal Difference Learning (TD Learning)**: A reinforcement learning method that updates the value estimate of a state toward a bootstrapped target, the observed reward plus the discounted value estimate of the next state, combining aspects of dynamic programming and Monte Carlo methods.
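
A minimal sketch of the tabular TD(0) update; the states, reward, and step size below are illustrative:

```python
import numpy as np

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    # Move V(s) toward the bootstrapped target r + gamma * V(s').
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return V

V = np.zeros(5)                          # value table for a hypothetical 5-state chain
V = td0_update(V, s=2, r=1.0, s_next=3)  # one observed transition updates V[2]
```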

**Tensor**: A multi-dimensional array used in machine learning and deep learning to represent data, such as scalars (0D tensors), vectors (1D tensors), matrices (2D tensors), and higher-dimensional arrays.
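
For illustration, NumPy arrays of increasing rank play the role of tensors:

```python
import numpy as np

scalar = np.array(3.0)          # 0-D tensor (rank 0)
vector = np.array([1.0, 2.0])   # 1-D tensor (rank 1)
matrix = np.ones((2, 3))        # 2-D tensor (rank 2)
batch  = np.zeros((4, 2, 3))    # 3-D tensor, e.g. a batch of matrices

print(scalar.ndim, vector.ndim, matrix.ndim, batch.ndim)  # 0 1 2 3
```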

**TensorFlow**: An open-source machine learning library developed by Google that provides a platform for building and deploying machine learning models, especially deep learning models.
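
A minimal sketch, assuming TensorFlow 2.x, defining and compiling a small network with the bundled Keras API:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                        # 4 input features
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # binary output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```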

**Test Set**: A subset of the dataset that is not used during training but is reserved for evaluating the performance of the model, helping to assess its generalization to unseen data.
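
A common way to reserve a test set is scikit-learn's `train_test_split`; the toy arrays here are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # toy feature matrix
y = np.arange(10) % 2              # toy binary labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # hold out 20% for evaluation only
```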

**Thompson Sampling**: A probabilistic algorithm used in reinforcement learning and multi-armed bandit problems that balances exploration and exploitation by sampling actions according to their probability of being optimal.
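
A minimal sketch for Bernoulli bandits with Beta priors; the arm payoff rates are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rates = [0.3, 0.5, 0.7]              # hypothetical arm payoff probabilities
alpha, beta = np.ones(3), np.ones(3)      # Beta(1, 1) prior for each arm

for _ in range(1000):
    samples = rng.beta(alpha, beta)       # sample a plausible rate per arm
    arm = int(np.argmax(samples))         # play the arm most likely to be optimal
    reward = rng.random() < true_rates[arm]
    alpha[arm] += reward                  # posterior update on success...
    beta[arm] += 1 - reward               # ...or on failure

print(alpha / (alpha + beta))             # posterior means favor the best arm
```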

**Thresholding**: A technique used in machine learning and image processing where values above or below a certain threshold are assigned to different categories or processed differently, often used in binary classification and edge detection.
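
In the binary classification case, thresholding model scores is a one-liner; the scores below are illustrative:

```python
import numpy as np

scores = np.array([0.15, 0.62, 0.48, 0.91])  # hypothetical predicted probabilities
labels = (scores >= 0.5).astype(int)         # threshold at 0.5 -> [0, 1, 0, 1]
```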

**Time Series**: A sequence of data points collected or recorded at successive points in time, often used in forecasting, stock market analysis, and other tasks where temporal patterns are important.

**Tokenization**: The process of breaking down text into smaller units, such as words, subwords, or characters, often used in natural language processing to prepare text for analysis.
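
A minimal word-level tokenizer sketch; production systems typically use subword schemes such as byte-pair encoding instead:

```python
import re

text = "Tokenization breaks text into smaller units."
# Lowercase, then take maximal runs of letters and digits as tokens.
tokens = re.findall(r"[a-z0-9]+", text.lower())
# ['tokenization', 'breaks', 'text', 'into', 'smaller', 'units']
```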

**Topic Modeling**: A type of statistical modeling used to discover abstract topics that occur in a collection of documents, with techniques like Latent Dirichlet Allocation (LDA) commonly used in text mining.
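
A minimal LDA sketch with scikit-learn; the three toy documents are illustrative:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["cats purr and meow softly",
        "dogs bark and fetch sticks",
        "stocks rallied as markets rose"]
X = CountVectorizer().fit_transform(docs)          # document-term count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)                  # per-document topic mixtures
```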

**Transfer Learning**: A machine learning approach where a model developed for one task is reused as the starting point for a model on a second, related task, often leading to faster training and improved performance.
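
One common pattern, sketched here with Keras and assuming network access to download ImageNet weights: freeze a pretrained backbone and train only a new task-specific head:

```python
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False                    # keep the pretrained features fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # new head for a binary task
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```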

**Transformer**: A neural network architecture that relies on self-attention mechanisms rather than recurrent layers, which enables parallel processing of sequence data; it is widely used in natural language processing tasks.
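
The core operation is scaled dot-product self-attention; a minimal NumPy sketch (single head, no masking):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # scaled pairwise logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values
```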

**True Negative (TN)**: In binary classification, a true negative is an outcome where the model correctly predicts the negative class for an instance that actually belongs to the negative class.

**True Positive (TP)**: In binary classification, a true positive is an outcome where the model correctly predicts the positive class for an instance that actually belongs to the positive class.
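
Both counts can be read off a confusion matrix; a small scikit-learn example with made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
# For binary 0/1 labels the matrix flattens to (tn, fp, fn, tp).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
# tn=1, fp=1, fn=1, tp=2
```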

**Truncation**: The process of limiting the number of elements in a sequence or the range of values in data, often used in machine learning, for example to cap text inputs at a fixed length or to keep extreme values from dominating the learning process.

**Turing Test**: A test proposed by Alan Turing to determine whether a machine can exhibit intelligent behavior indistinguishable from that of a human, often cited in discussions of artificial intelligence.

**Twin Support Vector Machine (Twin SVM)**: A variant of the standard support vector machine that solves two smaller-sized quadratic programming problems instead of one larger problem, often used for binary classification tasks.

**Type I Error**: Also known as a false positive, it occurs when a hypothesis test incorrectly rejects the null hypothesis, indicating that a relationship or effect exists when it actually does not.

**Type II Error**: Also known as a false negative, it occurs when a hypothesis test fails to reject the null hypothesis when there is actually a true effect or relationship.

**Trapezoidal Rule**: A numerical integration method that approximates the integral of a function by dividing the area under the curve into trapezoids, often used in numerical analysis and machine learning for approximating cumulative functions.
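
A minimal sketch of the rule itself, integrating sin(x) on [0, π], where the exact value is 2:

```python
import numpy as np

x = np.linspace(0.0, np.pi, 101)
y = np.sin(x)
# Sum trapezoid areas: average of adjacent heights times the step width.
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))   # ≈ 1.99984
```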

**Tree-Based Model**: A type of machine learning model that uses decision trees as its core structure, including algorithms like Random Forests and Gradient Boosted Trees, often used for both classification and regression tasks.
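
A minimal random-forest sketch with scikit-learn on a bundled toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:3]))   # class predictions for the first three samples
```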

**Truncated SVD**: A dimensionality reduction technique that approximates the singular value decomposition of a matrix by keeping only the largest singular values and their associated singular vectors, often used in latent semantic analysis and recommendation systems.
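
A minimal latent-semantic-analysis sketch with scikit-learn's `TruncatedSVD`; the documents are illustrative:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat", "the dog sat", "stocks fell sharply"]
X = TfidfVectorizer().fit_transform(docs)    # sparse TF-IDF term matrix
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)                     # 2-D latent representation
```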

**Training Set**: The subset of the dataset used to train a machine learning model, where the model learns patterns and relationships from the data to make predictions on new data.

**Trigram**: A sequence of three consecutive elements, often used in natural language processing to model language patterns, such as in n-gram language models for text prediction and analysis.
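
A minimal sketch that slides a window of length three over a token sequence:

```python
def ngrams(tokens, n=3):
    # Every contiguous run of n tokens is one n-gram.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams("the quick brown fox".split()))
# [('the', 'quick', 'brown'), ('quick', 'brown', 'fox')]
```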

**Triangular Kernel**: A type of kernel function used in kernel density estimation, which assigns weights to data points based on their distance from a target point, with closer points receiving higher weights; a common form is K(u) = 1 - |u| for |u| ≤ 1 and 0 otherwise.

**Transferability**: The degree to which a trained model or learned features can be applied to a different but related task or domain, often evaluated in transfer learning scenarios.

**Time Complexity**: A measure of the amount of computational time that an algorithm takes to run as a function of the size of the input, often used to evaluate and compare the efficiency of algorithms.

**Top-K Accuracy**: A metric used to evaluate classification models, particularly in multi-class tasks, where the prediction is considered correct if the true label is among the top K predicted labels.
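
A minimal NumPy sketch; the score matrix is illustrative:

```python
import numpy as np

def top_k_accuracy(scores, y_true, k=3):
    # scores: (n_samples, n_classes); correct if the true label is among the k best.
    top_k = np.argsort(scores, axis=1)[:, -k:]
    return float(np.mean([y in row for y, row in zip(y_true, top_k)]))

scores = np.array([[0.1, 0.3, 0.6], [0.5, 0.4, 0.1]])
print(top_k_accuracy(scores, y_true=[0, 0], k=2))
# 0.5: only the second sample has the true label 0 in its top 2
```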

**Two-Sample Test**: A statistical hypothesis test used to determine whether two independent samples come from the same distribution, often used in A/B testing and experimental analysis.
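
A minimal sketch with SciPy's two-sample t-test; the measurements are made up:

```python
from scipy import stats

a = [2.1, 2.5, 2.3, 2.8, 2.6]   # hypothetical control group
b = [2.9, 3.1, 2.7, 3.3, 3.0]   # hypothetical treatment group
t_stat, p_value = stats.ttest_ind(a, b)   # tests equality of the two means
```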

**Tikhonov Regularization**: A regularization method that stabilizes the solution of ill-posed problems by adding an L2 penalty term to the loss function; when applied to linear regression it is known as ridge regression.
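
A minimal closed-form sketch: the penalty adds a scaled identity to the normal equations before solving:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    # Solve (X^T X + lam * I) w = X^T y, the Tikhonov-regularized normal equations.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```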

**True Skill Statistic (TSS)**: A metric used to evaluate the performance of a binary classification model, defined as sensitivity + specificity - 1 (equivalently, the true positive rate minus the false positive rate), often used in ecological modeling and weather forecasting.

**Trust Region Optimization**: An iterative optimization method that restricts the step size of the update within a certain region around the current point, often used in nonlinear optimization to improve convergence stability.
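
SciPy exposes a trust-region method through `minimize`; a minimal sketch on the Rosenbrock function:

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

res = minimize(rosen, x0=np.array([-1.0, 2.0]), method="trust-constr")
print(res.x)   # converges to the minimum near [1.0, 1.0]
```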

**Target Encoding**: A technique in categorical feature encoding where the categories are replaced with the mean of the target variable, often used to capture the relationship between categorical variables and the target in a supervised learning context.
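
A minimal pandas sketch; in practice the means are usually computed out-of-fold (or smoothed) so the target does not leak into the features:

```python
import pandas as pd

df = pd.DataFrame({"city": ["a", "a", "b", "b", "b"],
                   "y":    [1,   0,   1,   1,   0]})
# Replace each category with the mean target observed for that category.
df["city_te"] = df["city"].map(df.groupby("city")["y"].mean())
# city 'a' -> 0.5, city 'b' -> 0.667
```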

**Temporal Pooling**: The process of summarizing a sequence of data points over time, often used in sequence models to aggregate information and reduce the temporal dimension.
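
A minimal sketch: collapse the time axis of a sequence of feature vectors by averaging or taking the maximum:

```python
import numpy as np

seq = np.random.default_rng(0).random((10, 64))  # hypothetical (time, features) sequence
pooled_mean = seq.mean(axis=0)                   # mean pooling over time -> (64,)
pooled_max = seq.max(axis=0)                     # max pooling over time -> (64,)
```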