Glossary

**Imbalanced Data**: A situation in machine learning where the classes in a dataset are not represented equally, often leading to biased models that perform poorly on the minority class.

**Imputation**: The process of replacing missing data with substituted values, often using statistical methods or machine learning models to estimate the missing values.
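
For illustration, a minimal sketch of mean imputation using scikit-learn's `SimpleImputer`; the toy matrix below is invented.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix with missing entries marked as np.nan.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])

# Replace each missing value with the mean of its column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)
print(X_filled)  # [[1. 2.] [4. 3.] [7. 2.5]]
```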

**Inference**: The process of using a trained machine learning model to make predictions or decisions based on new, unseen data.

**Instance-Based Learning**: A type of learning algorithm that memorizes training instances and makes predictions by comparing new instances to those stored in memory, as in k-nearest neighbors (k-NN).
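
A minimal k-NN sketch with scikit-learn; the Iris dataset and the choice of 3 neighbors are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Training" just stores the instances; prediction compares each new
# point to its 3 nearest stored neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))
```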

**Instance Segmentation**: A computer vision task that involves identifying and delineating each object instance in an image, combining object detection and semantic segmentation.

**Interquartile Range (IQR)**: A measure of statistical dispersion equal to the difference between the third and first quartiles (Q3 − Q1), i.e., the spread of the central 50% of the data, often used to detect outliers.
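
A small NumPy sketch of the common 1.5 × IQR outlier rule; the sample data are made up.

```python
import numpy as np

data = np.array([2, 4, 5, 5, 6, 7, 8, 9, 30])  # 30 is a suspect outlier

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Rule of thumb: flag points more than 1.5 * IQR beyond either quartile.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower) | (data > upper)]
print(iqr, outliers)  # 3.0 [30]
```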

**Iterative Algorithm**: An algorithm that repeatedly applies a set of rules or calculations until a specific condition is met, often used in optimization and machine learning tasks.
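
As one simple example, gradient descent on f(x) = (x − 3)² repeats an update rule until the step size falls below a tolerance; the learning rate and tolerance below are arbitrary.

```python
# Minimize f(x) = (x - 3)**2 by gradient descent; the gradient is 2 * (x - 3).
x, learning_rate, tolerance = 0.0, 0.1, 1e-8

while True:
    step = learning_rate * 2 * (x - 3)   # one application of the update rule
    x -= step
    if abs(step) < tolerance:            # stopping condition
        break

print(x)  # converges to approximately 3
```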

**Isolation Forest**: An unsupervised anomaly detection algorithm that isolates observations in an ensemble of random trees using randomly selected features and split values, based on the idea that anomalies require fewer splits to isolate than normal points; effective for high-dimensional datasets.
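
A minimal sketch with scikit-learn's `IsolationForest`; the synthetic data and injected outliers are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X_with_outliers = np.vstack([X, [[6.0, 6.0], [-5.0, 7.0]]])

# predict() returns +1 for inliers and -1 for anomalies.
clf = IsolationForest(random_state=0).fit(X_with_outliers)
labels = clf.predict(X_with_outliers)
print((labels == -1).sum(), "points flagged as anomalies")
```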

**Intelligent Agent**: A system that perceives its environment, makes decisions, and takes actions autonomously to achieve specific goals, often using AI and machine learning techniques.

**Image Augmentation**: A technique used to artificially increase the size of a training dataset by applying transformations such as rotation, flipping, or scaling to the original images.
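
A minimal NumPy sketch of flip-and-rotate augmentation; real pipelines typically use libraries such as torchvision or albumentations, and the toy image here is just a numbered grid.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped/rotated copy of an (H, W) or (H, W, C) image array."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)                 # horizontal flip
    out = np.rot90(out, k=rng.integers(4))   # rotate by 0/90/180/270 degrees
    return out

rng = np.random.default_rng(0)
image = np.arange(16).reshape(4, 4)
augmented = [augment(image, rng) for _ in range(4)]  # extra training variants
```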

**Image Captioning**: A task in computer vision and natural language processing where a model generates descriptive text for a given image, combining visual and linguistic information.

**ImageNet**: A large visual database designed for use in visual object recognition research, containing millions of labeled images across thousands of categories.

**Inception Module**: A component of the Inception neural network architecture that allows for efficient multi-scale processing by applying multiple convolutional filters of different sizes to the same input.
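
A simplified PyTorch sketch of the idea: parallel 1x1, 3x3, and 5x5 convolutions plus a pooling branch, concatenated along the channel dimension. The channel counts are arbitrary, and the original GoogLeNet module additionally uses 1x1 convolutions to reduce channels before the larger filters.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Simplified Inception-style block with four parallel branches."""

    def __init__(self, in_channels):
        super().__init__()
        self.branch1 = nn.Conv2d(in_channels, 16, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, 16, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, 16, kernel_size=5, padding=2)
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 16, kernel_size=1),
        )

    def forward(self, x):
        branches = [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)  # multi-scale features, 64 channels total

x = torch.randn(1, 3, 32, 32)
print(InceptionBlock(3)(x).shape)  # torch.Size([1, 64, 32, 32])
```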

**Incremental Learning**: A type of learning where the model is continuously updated with new data without retraining from scratch, allowing the model to adapt to changes over time.
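
One common realization is scikit-learn's `partial_fit`, sketched below with made-up streaming batches.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier()
classes = np.array([0, 1])

# Update the model batch by batch instead of retraining from scratch.
for _ in range(5):
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)
```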

**Independent Component Analysis (ICA)**: A computational technique used to separate a multivariate signal into additive, independent components, often used in signal processing and data analysis.
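
A minimal sketch with scikit-learn's `FastICA`, unmixing two invented source signals.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two independent sources mixed into two observed signals.
t = np.linspace(0, 8, 2000)
sources = np.c_[np.sin(2 * t), np.sign(np.cos(3 * t))]
mixing = np.array([[1.0, 0.5], [0.4, 1.0]])
observed = sources @ mixing.T

ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(observed)  # estimates of the independent sources
print(recovered.shape)  # (2000, 2)
```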

**Inductive Bias**: The set of assumptions a learning algorithm makes to generalize beyond the training data, often influencing the model's ability to learn effectively.

**Information Gain**: A metric used to evaluate how well a feature splits the data in decision tree algorithms, measuring the reduction in entropy after the data is split based on that feature.
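
A small sketch of the computation for a binary split; the label arrays are invented and the helper names are arbitrary.

```python
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    """Entropy of the parent minus the weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(information_gain(parent, parent[:4], parent[4:]))  # 1.0: a perfect split
```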

**Instance Weighting**: A technique in machine learning where different instances are assigned different weights during training, often used to handle imbalanced data or emphasize certain examples.
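
Many estimators expose this through a `sample_weight` argument, as in this scikit-learn sketch with made-up data and an arbitrary weight ratio.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 0, 1, 1])

# Give minority-class instances three times the weight of majority-class ones.
weights = np.where(y == 1, 3.0, 1.0)
model = LogisticRegression().fit(X, y, sample_weight=weights)
```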

**Inverted Index**: A data structure used in information retrieval systems that maps content, such as words, to their locations within documents, enabling fast search and retrieval.
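
A minimal sketch of the data structure with a handful of invented documents.

```python
from collections import defaultdict

documents = {
    0: "machine learning models learn from data",
    1: "information retrieval maps words to documents",
    2: "inverted index speeds up word search",
}

# Map each word to the set of document ids that contain it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word].add(doc_id)

print(sorted(index["data"]))       # [0]
print(sorted(index["documents"]))  # [1]
```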

**Invisible Cloak**: A concept in adversarial machine learning in which crafted perturbations or patterns are added to an input (such as an image or clothing) so that a model misclassifies it or fails to detect it, while the change appears negligible to humans.

**Identity Matrix**: A square matrix with ones on the diagonal and zeros elsewhere, often used in linear algebra and as a neutral element in matrix multiplication.
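
A quick NumPy check of the neutral-element property.

```python
import numpy as np

I = np.eye(3)                      # 3x3 identity matrix
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 4.0],
              [5.0, 0.0, 6.0]])

# Multiplying by the identity leaves the matrix unchanged.
print(np.allclose(A @ I, A) and np.allclose(I @ A, A))  # True
```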

**Information Retrieval (IR)**: The process of obtaining relevant information from a large repository, such as documents or databases, based on user queries, often using techniques from natural language processing and machine learning.

**Interpolation**: The method of estimating unknown values that fall between known data points, often used in data analysis, image processing, and numerical simulations.
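
A minimal linear-interpolation sketch with NumPy; the known points are invented.

```python
import numpy as np

# Known data points.
x_known = np.array([0.0, 1.0, 2.0, 4.0])
y_known = np.array([0.0, 2.0, 3.0, 5.0])

# Linearly estimate values between the known points.
x_new = np.array([0.5, 3.0])
print(np.interp(x_new, x_known, y_known))  # [1. 4.]
```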

**Interactive Learning**: A learning approach where the model interacts with the environment or a human user during the learning process, often used in reinforcement learning and human-in-the-loop systems.

**Intelligent Tutoring System (ITS)**: A computer system that provides personalized instruction or feedback to learners, often using AI techniques to adapt to the learner's needs and progress.

**Intrinsic Dimensionality**: The minimum number of dimensions required to represent the underlying structure of a dataset, often lower than the observed dimensionality due to redundancy or correlations between features.

**Iterative Deepening Search**: A search algorithm that repeatedly runs depth-limited depth-first search with increasing depth limits, combining the low memory use of depth-first search with the completeness of breadth-first search.
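
A minimal sketch for an acyclic adjacency-list graph (invented here); production versions would also track visited nodes to handle cycles.

```python
def depth_limited(graph, node, goal, limit, path):
    """Depth-first search that stops expanding below the given depth limit."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for neighbor in graph.get(node, []):
        result = depth_limited(graph, neighbor, goal, limit - 1, path + [neighbor])
        if result is not None:
            return result
    return None

def iterative_deepening(graph, start, goal, max_depth=10):
    # Repeat depth-limited DFS with an increasing limit until the goal is found.
    for limit in range(max_depth + 1):
        result = depth_limited(graph, start, goal, limit, [start])
        if result is not None:
            return result
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": ["F"]}
print(iterative_deepening(graph, "A", "F"))  # ['A', 'C', 'E', 'F']
```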

**Imbalanced Classification**: A classification task where the classes are not represented equally in the training data, leading to challenges in building accurate models for the minority class.

**Information Theory**: A branch of applied mathematics that deals with the quantification, storage, and communication of information, often used in machine learning for tasks like feature selection and model evaluation.

**Instance Hardness**: A measure of how difficult it is for a machine learning algorithm to correctly classify a particular instance, often used to identify challenging cases in a dataset.

**Irreducible Error**: The error that cannot be reduced by any model because it arises from the inherent noise or randomness in the data, representing the lower bound of model performance.

**Initialization**: The process of setting the initial values of a model's parameters before training begins, crucial in neural networks to ensure convergence and avoid issues like vanishing or exploding gradients.

**Interpretability**: The degree to which a human can understand the cause of a decision made by a machine learning model, often contrasted with the accuracy or complexity of the model.

**Invariance**: A property of a machine learning model that ensures its predictions remain unchanged under certain transformations of the input data, such as translation or rotation in image recognition.

**Interactive Data Visualization**: Techniques that allow users to manipulate and explore data visualizations in real-time, often used to gain insights from complex datasets and support decision-making.

**Image Restoration**: The process of recovering a clean image from a degraded version, often using machine learning techniques to remove noise, blur, or other distortions.

**Irregular Data**: Data that does not follow a regular structure or pattern, often requiring specialized techniques for processing and analysis in machine learning tasks.

**Image Super-Resolution**: The process of enhancing the resolution of an image using machine learning techniques, often by predicting high-resolution details from low-resolution inputs.

**Inverse Reinforcement Learning (IRL)**: A type of reinforcement learning where the goal is to infer the reward function that an observed agent is optimizing, often used in scenarios where the reward is not directly observable.

**Invariant Representation**: A representation of data that remains unchanged under certain transformations, often used in computer vision and signal processing to improve model robustness.

**Isomap**: A nonlinear dimensionality reduction technique that preserves the geodesic distances between points in a high-dimensional space, often used for visualizing complex data structures.
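
A minimal sketch with scikit-learn's `Isomap` on the classic swiss-roll dataset; the neighbor count is an arbitrary choice.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# The swiss roll is a 2D surface curled up in 3D space.
X, _ = make_swiss_roll(n_samples=1000, random_state=0)

# Isomap "unrolls" it by preserving geodesic (along-the-surface) distances.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)
print(embedding.shape)  # (1000, 2)
```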