Glossary

**Uncertainty Quantification**: The process of estimating how uncertain a model's predictions are, often by providing confidence intervals, probability distributions, or variance estimates, to improve decision-making and model reliability.

**Underfitting**: A situation where a machine learning model is too simple to capture the underlying patterns in the data, leading to poor performance on both the training set and new data.

**Undirected Graph**: A type of graph where the edges have no direction, meaning the relationships between nodes are bidirectional, often used in network analysis and probabilistic graphical models like Markov random fields.

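**Uniform Distribution**: A probability distribution in which every outcome within a specified range is equally likely; for a continuous variable on the interval [a, b], the density is constant at 1/(b - a). It is often used to model a quantity about which nothing is known beyond its bounds, and as the starting point for random number generation.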

**Univariate Analysis**: The simplest form of statistical analysis that examines each variable in a dataset independently, often used to describe and summarize the basic features of the data.

**Unsupervised Learning**: A type of machine learning where the model is trained on data without labeled outcomes, learning to identify patterns, groupings, or structures in the data, often used in clustering and dimensionality reduction.

**Update Rule**: In machine learning algorithms, the formula or method used to adjust model parameters during training, such as the weight updates in gradient descent.
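
As an illustration of the gradient descent case mentioned above, here is a minimal sketch of one update rule, `theta <- theta - learning_rate * grad`. The quadratic loss, the starting point, and the function names are assumptions chosen for the example, not part of any particular library.

```python
def gradient_descent_step(theta, grad, learning_rate):
    """Apply one update: theta <- theta - learning_rate * grad."""
    return [t - learning_rate * g for t, g in zip(theta, grad)]


def loss_gradient(theta):
    """Gradient of the illustrative loss L(theta) = sum((theta_i - 3)^2)."""
    return [2.0 * (t - 3.0) for t in theta]


# Repeatedly applying the rule moves each parameter toward the minimiser 3.0.
theta = [0.0, 10.0]
for _ in range(100):
    theta = gradient_descent_step(theta, loss_gradient(theta), learning_rate=0.1)
```

With this learning rate, each step shrinks the distance to the minimum by a constant factor of 0.8, so `theta` ends up very close to `[3.0, 3.0]`.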

**Upsampling**: A technique used to increase the resolution or size of data, often applied in image processing to generate higher-resolution images or in time series analysis to create more frequent data points.
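
For the time series case, one simple way to create more frequent data points is linear interpolation between neighbouring observations. The sketch below assumes evenly spaced samples and an integer `factor`; the function name is hypothetical, and other interpolation schemes would be equally valid.

```python
def upsample_linear(values, factor):
    """Insert (factor - 1) linearly interpolated points between neighbours."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if len(values) < 2:
        return list(values)
    out = []
    for left, right in zip(values, values[1:]):
        # Emit the left endpoint (k = 0) followed by the interior points.
        for k in range(factor):
            out.append(left + (right - left) * k / factor)
    out.append(values[-1])  # the final observation closes the series
    return out


hourly = [0.0, 2.0, 4.0]
half_hourly = upsample_linear(hourly, factor=2)  # [0.0, 1.0, 2.0, 3.0, 4.0]
```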

**Utility Function**: A mathematical function used in decision theory and economics to model the preferences of an agent, often used to represent the satisfaction or value derived from different outcomes.

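**Universal Approximation Theorem**: A theoretical result stating that a feedforward neural network with a single hidden layer and a suitable non-polynomial activation function can approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden neurons. The theorem guarantees that such a network exists but says nothing about how many neurons are needed in practice or whether training will find the right weights.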

**Unbalanced Dataset**: A dataset where the distribution of classes is not even, often leading to biased models that perform poorly on the minority class, commonly addressed with techniques like SMOTE or class weighting.

**U-Net**: A convolutional neural network architecture originally developed for biomedical image segmentation, characterized by a U-shaped encoder-decoder structure in which a contracting path captures context and an expanding path, fed by skip connections from the encoder, enables precise localization.

**Uplift Modeling**: A type of predictive modeling used to estimate the incremental impact of a treatment or action on an individual's behavior, often used in marketing to target campaigns more effectively.

**Unsupervised Pretraining**: A technique where a model is first trained in an unsupervised manner to learn general features from data before being fine-tuned on a supervised task, often used to improve performance with limited labeled data.

**User-Based Collaborative Filtering**: A recommendation system approach that suggests items to users based on the preferences of similar users, often used in personalized recommendation systems like those in streaming services or e-commerce.
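
A minimal sketch of the idea, assuming cosine similarity over co-rated items and a similarity-weighted average as the prediction; the rating data and user names are made up for illustration, and production systems typically add mean-centering and neighbour selection.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


# Hypothetical ratings: "ben" agrees with "ana", "cy" mostly disagrees.
ratings = {
    "ana": {"film_a": 5, "film_b": 1},
    "ben": {"film_a": 5, "film_b": 1, "film_c": 4},
    "cy": {"film_a": 1, "film_b": 5, "film_c": 2},
}
prediction = predict_rating(ratings, "ana", "film_c")
```

Because "ben" is the more similar user, the prediction for "ana" lands closer to ben's rating of 4 than to cy's rating of 2.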

**Unit Test**: A type of software testing that focuses on verifying the correctness of individual components or functions in a program, ensuring that each part works as intended independently of others.
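
A small example using Python's standard `unittest` module. The `clip` function is a hypothetical unit under test, chosen only because its behaviour is easy to state; each test method checks one case in isolation.

```python
import unittest


def clip(value, low, high):
    """Hypothetical function under test: limit value to the range [low, high]."""
    return max(low, min(value, high))


class ClipTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clip(5, 0, 10), 5)

    def test_value_below_range_is_raised_to_low(self):
        self.assertEqual(clip(-3, 0, 10), 0)

    def test_value_above_range_is_lowered_to_high(self):
        self.assertEqual(clip(42, 0, 10), 10)


def run_tests():
    """Run the suite programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClipTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a project, the same test class would normally live in its own file and be discovered by a test runner rather than invoked through a helper like `run_tests`.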

**Undersampling**: A technique used to balance class distribution in an imbalanced dataset by reducing the number of instances in the majority class, often used to improve model performance on the minority class.
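
The simplest variant is random undersampling, sketched below under the assumption that every class should be cut down to the size of the smallest one. The function name and the toy data are illustrative; dedicated libraries offer more refined strategies.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Eight majority-class samples (label 0) and two minority-class samples (label 1).
samples = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_samples, balanced_labels = random_undersample(samples, labels)
```

After the call, both classes have two samples each; the six discarded majority samples are the price paid for the balance.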

**Uncertainty Propagation**: The process of determining how uncertainties in input variables propagate through a model to affect the uncertainty of the output, often used in sensitivity analysis and risk assessment.
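
One common way to do this is Monte Carlo sampling: draw many input values, run each through the model, and measure the spread of the outputs. The sketch below assumes independent Gaussian inputs; the `rectangle_area` model and all numbers are hypothetical.

```python
import random
import statistics


def propagate_monte_carlo(model, input_means, input_stds, n_samples=20000, seed=0):
    """Sample independent Gaussian inputs, push them through `model`,
    and summarise the outputs by their mean and standard deviation."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        inputs = [rng.gauss(m, s) for m, s in zip(input_means, input_stds)]
        outputs.append(model(*inputs))
    return statistics.mean(outputs), statistics.stdev(outputs)


def rectangle_area(width, height):
    """Illustrative model: the output depends on two uncertain inputs."""
    return width * height


mean_area, std_area = propagate_monte_carlo(
    rectangle_area, input_means=[2.0, 3.0], input_stds=[0.1, 0.2]
)
```

For comparison, the first-order (linearised) formula for this model gives a standard deviation of sqrt((3 × 0.1)² + (2 × 0.2)²) = 0.5, and the sampled estimate should land close to that value.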

**Unique Path**: In graph theory and combinatorics, a path between two nodes that is the only such path, as in a tree, where exactly one simple path connects any pair of nodes. The idea appears in algorithms related to routing and connectivity.

**Unstructured Data**: Data that does not have a predefined format or structure, such as text, images, or audio, often requiring specialized techniques for processing and analysis in machine learning.

**Univariate Time Series**: A time series that consists of observations on a single variable recorded over time, often analyzed using methods like ARIMA or exponential smoothing for forecasting.
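
Of the methods named above, simple exponential smoothing is short enough to sketch directly. This version assumes the level is initialised to the first observation and uses a fixed smoothing factor `alpha`; the demand figures are invented for the example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last one is the one-step-ahead forecast."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    level = series[0]  # assumption: start the level at the first observation
    levels = [level]
    for observation in series[1:]:
        # New level blends the latest observation with the previous level.
        level = alpha * observation + (1.0 - alpha) * level
        levels.append(level)
    return levels


demand = [10.0, 12.0, 14.0]
levels = simple_exponential_smoothing(demand, alpha=0.5)  # [10.0, 11.0, 12.5]
forecast = levels[-1]
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths more heavily.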

**Upweighting**: The process of giving more importance to certain data points or classes during model training, often used to address imbalances in the data or to emphasize more critical observations.
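
A minimal sketch of class-based upweighting, assuming inverse-frequency weights of the form `n_samples / (n_classes * class_count)` and a binary cross-entropy loss. Both function names are hypothetical, and predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


def weighted_log_loss(labels, probabilities, class_weights):
    """Binary cross-entropy where each example is scaled by its class weight."""
    total = weight_sum = 0.0
    for y, p in zip(labels, probabilities):
        w = class_weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Four examples of class 0 and one of class 1: the rare class gets weight 2.5.
labels = [0, 0, 0, 0, 1]
weights = inverse_frequency_weights(labels)  # {0: 0.625, 1: 2.5}
```

With these weights, a mistake on the single minority example contributes as much to the loss as mistakes on all four majority examples combined.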

**Update Equation**: The mathematical formula used in iterative algorithms to compute new values from the current ones, such as the weight update in gradient descent or the Bellman update used in value iteration for reinforcement learning.
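
Written out, the two examples mentioned above take the following standard forms. The symbols are assumptions introduced here: \theta_t are the parameters at step t, \eta is the learning rate, L is the loss, V_k is the value estimate at iteration k, P and R are the transition probabilities and rewards, and \gamma is the discount factor.

```latex
% Gradient descent: move the parameters against the gradient of the loss.
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)

% Value iteration: apply the Bellman optimality equation as an iterative update.
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a) \left[ R(s, a, s') + \gamma \, V_k(s') \right]
```

In both cases the right-hand side uses only quantities from the current iteration, which is what makes the equation usable as an iterative update.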

**User Embedding**: A representation of a user as a dense vector in a low-dimensional space, often learned in recommendation systems so that users with similar preferences lie close together, enabling personalized recommendations.