Data Science and AI Integration

My work applies experimental, AI‑assisted, data‑driven methodologies to optimize semiconductor manufacturing across process, device, and yield development. These methodologies are integrated into engineering platforms and supported by semiconductor, statistical, machine‑learning, and deep‑learning technologies. The key components of my work on AI‑driven engineering platforms are:

  • AI-assisted software: AI-agent
  • AI-assisted data analysis: yield analysis enabling yield-aware design and yieldable process/device
    • Machine Learning: PCA, SVM, Bayesian Optimization
    • Deep Learning: time-series data
  • Statistical data analysis: Gaussian, Poisson, Order statistics, Extreme Value Distribution
  • (Semiconductor) Technology-based analysis: Device physics, Small circuit simulation, Error propagation, Monte Carlo Simulation, DOE/RSM, Split-CV, Dielectric Conduction, Variability, BKM management, Soft/hard yield
  • Full-stack web platform using WordPress, Flask, or Next.js

The motivation for converging semiconductor technology with data science is that technology-aware software depends on both, and such software enables:

  1. Advancing semiconductor technology
  2. Improving engineers’ productivity
  3. Creating a more fulfilling work environment

A technology‑aware software engineer can deliver this integration effectively, since domain‑aware development fits well with Agile and DevOps practices.

A technology‑aware software tool provides several key benefits:

  1. Helps engineers quickly learn the legacy knowledge from previous technologies
  2. Enables engineers to absorb leading‑edge technology more effectively
  3. Speeds up computational workflows
  4. Ensures work is performed in a standardized manner
  5. Standardizes data by serving as a de facto specification
  6. Evolves with the technology, although the continuous improvement this requires has both pros and cons

Applied Statistics

AI-assisted Semiconductor Development ….


Related Posts below (or view All Articles)

Categories = “Data Science, AI-powered, Applied Statistics”

The 4-Axis Matrix for ML Paper Study: A Case Study on Within-Wafer Variation Prediction
Data Science

By Wolf
Created: 2026.04.24 | Modified: 2026.04.24
Reading ML papers often raises a recurring question: where does the true novelty of this paper actually live? A paper may claim a “new framework” while being a recombination of…
Read More
ML Methodology: Taxonomy for Within-Wafer Variation Prediction
Data Science | AI-powered | Semiconductor

By Wolf
Created: 2026.04.24 | Modified: 2026.04.24
This report presents a structured taxonomy of machine learning methodologies specialized for Within-Wafer (WIW) variation prediction in semiconductor manufacturing. General-purpose ML approaches often fail…
Read More
Physics-Informed Machine Learning: Integrating Physical Laws and Domain Knowledge into Numeric AI/ML
Data Science | Evaluation Metric | Training

By Wolf
Created: 2026.04.18 | Modified: 2026.04.19
The Core Principle of PIML: Loss Function $$ \mathcal{L}_{total} = \mathcal{L}_{data} + \lambda \cdot \mathcal{L}_{phys} $$ Experience (Observations) + Reason (First Principles). Numeric data-driven AI/ML models are powerful when abundant…
Read More
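As a minimal sketch of the loss in the excerpt above, the snippet below computes $\mathcal{L}_{total} = \mathcal{L}_{data} + \lambda \cdot \mathcal{L}_{phys}$ for a toy problem. The function name `piml_total_loss`, the use of mean squared terms for both losses, and the example physics constraint $dy/dx = -k\,y$ are assumptions made for illustration; they are not taken from the post itself.

```python
import numpy as np

def piml_total_loss(y_pred, y_obs, phys_residual, lam=1.0):
    """Return (L_total, L_data, L_phys) with L_total = L_data + lam * L_phys.

    Both terms are mean squared quantities (an assumption of this sketch).
    """
    y_pred = np.asarray(y_pred, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    phys_residual = np.asarray(phys_residual, dtype=float)
    l_data = np.mean((y_pred - y_obs) ** 2)   # mismatch with observations
    l_phys = np.mean(phys_residual ** 2)      # violation of the physical law
    return l_data + lam * l_phys, l_data, l_phys

# Hypothetical constraint: predictions should satisfy dy/dx = -k * y.
k = 2.0
x = np.linspace(0.0, 1.0, 11)
y_obs = np.exp(-k * x)             # synthetic "observations"
y_pred = np.exp(-k * x)            # a model that matches the data exactly
dy_dx = np.gradient(y_pred, x)     # finite-difference derivative
residual = dy_dx + k * y_pred      # zero only if the law holds exactly

total, l_data, l_phys = piml_total_loss(y_pred, y_obs, residual, lam=0.1)
```

Here the data term is zero because the predictions equal the observations, so the total is driven only by the weighted physics term, which is small but nonzero because of the finite-difference derivative. The weight `lam` sets how strongly the physical law is enforced relative to the data fit.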
Post-hoc Prediction Correction: Long-Term Bias & Short-Term Drift
Data Science | Test | Training

By Wolf
Created: 2026.04.18 | Modified: 2026.04.18
A trained regressor that performs well on historical data often drifts in production for two distinct reasons: long-term bias and short-term drift. Retraining is the principled fix for both, but it is expensive and sometimes…
Read More
One-Hot Encoding Pitfalls and Countermeasures
Feature Engineering | Data Science

By Wolf
Created: 2026.04.16 | Modified: 2026.04.17
1. What is One-Hot Encoding? One-hot encoding is the most fundamental technique for converting categorical variables into numerical vectors that machine learning models can process. Given N unique categories, each…
Read More
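To make the one-hot encoding described in the excerpt above concrete, here is a minimal sketch that maps N unique categories to N-dimensional 0/1 vectors. The helper name `one_hot` and the example tool labels are hypothetical and chosen only for illustration.

```python
import numpy as np

def one_hot(values):
    """Encode a list of categorical labels as rows of 0/1 indicators.

    Returns the sorted category list and an array of shape
    (len(values), number_of_unique_categories).
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=int)
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1   # exactly one 1 per row
    return categories, encoded

# Hypothetical categorical feature: the tool that processed each wafer.
tools = ["etch_A", "etch_B", "etch_A", "etch_C"]
categories, encoded = one_hot(tools)
```

Each row contains a single 1 in the column of its category, so the number of columns grows with the number of unique categories.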
Balancing Model Sensitivity and Explainability (R²)
Data Science | Training

By Wolf
Created: 2026.04.16 | Modified: 2026.04.16
In the model training process, the responsiveness (sensitivity) of $Y$ to variations in $X$ is a core factor that determines the…
Read More
Impact of Target Variable Ranges on Model Performance
Data Science | Feature Engineering

By Wolf
Created: 2026.04.16 | Modified: 2026.04.16
1. Analysis of Small Positive Ranges (0 to 0.2) While a target range of 0 to 0.2 is mathematically valid, it presents several practical challenges in model training and optimization.…
Read More
Addressing Random Seed Sensitivity in Feature Selection: A Survey of Methods and Recent Advances (2025–2026)
Data Science | Feature Engineering

By Wolf
Created: 2026.04.15 | Modified: 2026.04.16
Random Seed Sensitivity in Feature Selection. 1. Problem Statement: Feature selection results can vary significantly depending on the random seed used during model training, data splitting,…
Read More
Center Alignment Index (CAI): A Novel Metric for Evaluating Data Center Agreement on the 1:1 Line
Evaluation Metric | Data Science

By Wolf
Created: 2026.04.09 | Modified: 2026.04.10
Introduction to Center Alignment Index (CAI) as a Regression Metric 1. Mathematical Definition and Components The Center Alignment Index (CAI) is a bounded metric designed to quantify how closely the…
Read More
Lin’s Concordance Correlation Coefficient (CCC) in AI/ML
Data Science | Evaluation Metric

By Wolf
Created: 2026.04.09 | Modified: 2026.04.09
Introduction to Lin’s Concordance Correlation Coefficient (CCC) In the fields of Artificial Intelligence (AI) and Machine Learning (ML), evaluating model performance typically revolves around accuracy, precision, or error metrics. However,…
Read More
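For readers who want to see the metric from the excerpt above in code, the following is a minimal sketch of Lin's CCC using its standard definition, $\rho_c = 2 s_{xy} / (s_x^2 + s_y^2 + (\bar{x} - \bar{y})^2)$, with population (1/n) variances and covariance. The function name `lins_ccc` and the sample values are assumptions for illustration and are not taken from the post.

```python
import numpy as np

def lins_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient with 1/n moments."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()        # ddof=0 by default
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
ccc_perfect = lins_ccc(y_true, y_true)          # predictions on the 1:1 line
ccc_shifted = lins_ccc(y_true, y_true + 1.0)    # same shape, constant offset
pearson_shifted = np.corrcoef(y_true, y_true + 1.0)[0, 1]
```

The shifted predictions are perfectly correlated with the targets, so Pearson's r stays at 1, while the CCC drops to 2.5 / 3.5 ≈ 0.714 because the mean offset enters the denominator. This is the distinction between correlation and agreement that the metric is designed to capture.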
The Impact of Variance Components on the Coefficient of Determination ($R^2$)
Data Science | Evaluation Metric

By Wolf
Created: 2026.04.08 | Modified: 2026.04.09
1. Executive Summary The Coefficient of Determination, denoted as $R^2$, is one of the most widely used metrics for assessing the goodness-of-fit in linear regression models. However, its interpretation is…
Read More
Mean, Variance, and Agreement Metrics for Regression in AI/ML
Data Science | Evaluation Metric

By Wolf
Created: 2026.04.08 | Modified: 2026.04.10
1. Executive Summary In advanced engineering domains—such as semiconductor manufacturing, virtual metrology, and multi-sensor time-series analysis—the validation of predictive models requires more than a single performance score. We must distinguish…
Read More