Introduction

Spectrometers are powerful tools used in scientific research and industrial settings for chemical analysis1, remote sensing2, and other applications. Despite this versatility, they are typically confined to static environments, such as laboratories and factories, because of their bulky size, long operational times, and high costs. Given these restrictions and practical application requirements, optical filter-based spectroscopy is emerging as a promising alternative. A filter-based spectrometer can be realized by attaching a filter array to a complementary metal-oxide-semiconductor (CMOS) image sensor. Unlike grating-based spectrometers, which require diffractive optics and motorized components, such a configuration enables a low-cost and compact design. However, covering a wide wavelength range with high resolution requires numerous narrow-bandpass filters, and fabricating such delicate filters is challenging because integrating them into the small area of a CMOS sensor in array form is complicated.

Instead of enhancing the resolution of spectrometers through the elaboration of optical filters, computational approaches have been employed in conventional spectrometers to improve resolution3,4. Additionally, advanced computational spectrometers that employ compressed sensing (CS) theory5,6,7 have been proposed to increase resolution further, while reducing the number of filters required8,9. To reconstruct the unknown spectrum of incident light, these spectrometers primarily adopt iterative numerical optimization methods10,11,12, leveraging the sparsity of light sources or their sparse representation on a particular basis. Owing to the sparse nature of signals and filter design techniques under CS theory, these computational spectrometers have achieved a 7-fold improvement in spectral resolution9. To realize these improvements, various spectral encoders, such as quantum dot filters13,14, etalon filters15,16, photonic crystal slabs17,18, van der Waals junctions19,20,21, nanowires22, and multilayer thin films (MTFs)23,24 have been proposed. Unlike conventional spectrometers that selectively measure specific light wavelengths, these advanced photonic structures are designed with unique transmission functions, enabling the measurement of a broad wavelength range. By adopting these photonic structures, it is possible to cover a broad wavelength spectrum with a minimal number of measurements, thereby achieving a compact device size.

Recent advances in signal reconstruction algorithms based on deep learning (DL)25 have significantly impacted the field of CS. By adopting CS theory, sensing systems achieve faster measurements and reduced costs. Traditionally, these advantages were hindered by slow reconstruction speeds due to the reliance on complex numerical optimization algorithms. However, DL has successfully addressed these issues by replacing traditional reconstruction methods with neural networks that can efficiently reconstruct original signals26,27,28. The integration of the CS and DL frameworks has led to improved performance and reduced computational complexity.

In this study, we propose a DL-based single-shot computational spectrometer for recovering both narrow and broad spectral ranges. As a hardware configuration of the spectrometer, we employed an MTF filter array consisting of 36 filters and a CMOS camera. To train, validate, and test our DL architecture, we collected 3,223 spectra with abundant spectral features, including both narrow and broad spectra, using combinations of color filters and a monochromator. After training, we evaluated its performance on data withheld from the training and validation processes. The average root mean squared error (RMSE) for the test set was 0.0288. The experimental results demonstrate that the proposed DL architecture effectively reconstructs both narrowband quasi-monochromatic and broadband mixed spectra simultaneously, without introducing biases. The proposed spectrometer is compact, single-shot, and suitable for mass production. Furthermore, applying the DL technique can offer high resolution, a wide working range, and fast measurement. Therefore, the proposed spectrometer can serve as a new form factor for on-site detection, such as in drink inspection, counterfeit document detection, and self-diagnosis.
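The RMSE figure quoted above is a standard per-spectrum error metric. As a minimal illustrative sketch (the sampling grid below is an assumption, not taken from the paper), it can be computed as:

```python
import numpy as np

def rmse(x_true, x_pred):
    """Root mean squared error between ground-truth and reconstructed spectra."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    return float(np.sqrt(np.mean((x_true - x_pred) ** 2)))

# Toy check: a reconstruction offset by 0.03 at every wavelength has an
# RMSE of 0.03 (the wavelength grid below is only an assumed example).
x_true = np.linspace(0.0, 1.0, 351)    # e.g., 500-850 nm sampled at 1 nm
x_pred = x_true + 0.03
print(rmse(x_true, x_pred))            # ≈ 0.03
```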

The main contributions of this paper are outlined as follows:

  1. A DL-based single-shot computational spectrometer built on a mass-producible MTF filter array is presented, offering compact size, high resolution, a wide working range, and fast reconstruction.

  2. A diverse spectral dataset is developed to train and evaluate the DL architecture, demonstrating effective performance in reconstructing both narrow and broad spectra simultaneously.

  3. This study extends the previous work on spectral reconstruction presented by Kim et al.24.

The rest of the article is organized as follows: “Problem restatement”, “Related work”, “Methods”, “Experimentation”, “Results and discussions”, “Comparison with conventional and modern spectrometers”, “Open research challenges”, and “Conclusion”.

Problem restatement

This study addresses the challenge of reconstructing spectral information from the intensities measured using an MTF filter array attached to a CMOS image sensor. The problem at hand is an underdetermined system represented by \({y} = {A} {x} + {n}\), where the number of wavelengths exceeds the number of filters used, making it challenging to reconstruct the incident spectrum \({x}\) from the measured intensities \({y}\). Traditional methods29 may struggle to provide unique solutions due to this limitation. To address this challenge, we consider various techniques, including regularization (L1 and L2 regularization, Tikhonov regularization, and total variation regularization)30,31, optimization algorithms (gradient descent, the Adam optimizer, proximal gradient methods, alternating least squares, and coordinate descent)24,32, and advanced DL-based reconstruction methods. Regularization guides solutions toward plausible outcomes, while optimization algorithms minimize a cost function to approximate unknown quantities. We focus on DL for its strong capacity to model complex data relationships. Convolutional neural networks (CNNs) excel in capturing patterns across wavelengths, enabling full-spectrum reconstruction from limited data. We propose a dense layer and U-Net architecture with residual connections to address this underdetermined problem, aiming for high accuracy, compact size, and fast operation, which is suitable for mobile and on-site spectrometry.
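As a concrete baseline for the underdetermined problem \({y} = {A} {x} + {n}\), the Tikhonov (L2) regularization mentioned above admits a closed-form solution. The sketch below uses a random stand-in sensing matrix and an assumed 351-bin wavelength grid, not the actual filter responses:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 36, 351                 # 36 filter readouts, 351 spectral bins (assumed grid)
A = rng.uniform(0.0, 1.0, size=(M, N))   # random stand-in for the sensing matrix

# Smooth broadband test spectrum (a Gaussian bump) and noisy measurements.
lam = np.linspace(500.0, 850.0, N)
x_true = np.exp(-((lam - 650.0) ** 2) / (2.0 * 30.0 ** 2))
y = A @ x_true + rng.normal(0.0, 1e-3, size=M)

# Tikhonov (L2) regularized estimate, using the M x M dual form since M < N:
# x_hat = A^T (A A^T + alpha I)^{-1} y; alpha trades fidelity against energy.
alpha = 1e-2
x_hat = A.T @ np.linalg.solve(A @ A.T + alpha * np.eye(M), y)
print(np.sqrt(np.mean((x_hat - x_true) ** 2)))   # reconstruction RMSE
```

The regularization weight `alpha` must be tuned; too small amplifies noise, too large over-smooths the estimate.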

Related work

In this section, recent advances in computational spectrometers are introduced by focusing on their photonic structures, reconstruction methods, and demonstrations.

Zhang et al.33 introduced a promising approach to computational spectroscopy by integrating a metasurface-based spectrometer with DL algorithms. Their snapshot computational spectroscopy tool demonstrated significant advancements in terms of size, speed, and versatility, with a compact design (100 \(\times\) 50 \(\times\) 50 mm3) and sub-nanometer resolution. The study reported a spectral reconstruction accuracy of 99.4%, a spectral resolution of 0.4 nm, and a measurement error of 0.32 nm, which is less than its spectral resolution. In this context, the method is a competitive alternative to traditional spectrometers. However, the reliance on extensive experimental spectral data for DNN training, combined with the limited operational range from 400 to 900 nm, poses challenges to its broader applicability. Despite these constraints, the framework provides a solid foundation for developing portable and efficient spectroscopic tools, particularly when extended to ultraviolet (UV) and infrared (IR) applications and optimized for encoder designs. Overall, this study represents a critical step toward the development of miniaturized, rapid, and accurate spectroscopic technologies. However, its scalability and applicability in diverse real-world scenarios remain areas for further exploration.

Bian et al.34 proposed an advanced on-chip computational hyperspectral imaging solution, which combines a broadband multispectral filter array with a spectral reconstruction network. The broadband multispectral filter array enhances light throughput by modulating incident light across a broad spectral range, thereby improving performance in low-light and long-distance imaging applications. The spectral reconstruction network efficiently reconstructs hyperspectral data cubes from compressed measurements, offering high-resolution, real-time hyperspectral imaging. The proposed sensors cover spectral ranges from 400 to 1000 nm or from 400 to 1700 nm, and achieve excellent spectral resolution and high light throughput. These sensors outperform traditional systems, especially in low-light conditions. Practical applications, including agriculture, health monitoring, and industrial automation, showcase the sensor’s high signal-to-noise ratio and broadband capabilities. Despite the promising performance, challenges such as the complex fabrication process and the need for computational resources remained unaddressed by the authors. Still, the sensors provide a compact and efficient solution for next-generation hyperspectral imaging, with applications in diverse fields.

Chen et al.35 proposed an ultra-simplified computational spectrometer in their study, which employs a one-to-broadband diffraction mapping strategy using an arbitrarily shaped pinhole as the partial disperser. Their design eliminates the need for complex pre-encoding, calibration, and high-precision fabrication. It achieved spectral peak ___location accuracy better than 1 nm over a 200 nm bandwidth and a resolution of 3 nm for a bimodal spectrum. The compact spectrometer provides single-shot spectrum measurements across a broad wavelength range, making it ideal for mobile applications. It also demonstrated a breakthrough in broadband coherent diffractive imaging, overcoming challenges like unknown illumination spectra and detector quantum efficiency corrections. The authors’ approach offers a low-cost, robust solution with great potential for broadband spectrum metrology and computational imaging.

Yako et al.36 proposed a video-rate hyperspectral camera that combines CS with CMOS-compatible Fabry–Pérot filters to overcome the limitations of traditional hyperspectral imaging systems, such as low sensitivity and resolution. The proposed system achieves a sensitivity of 45% for visible (VIS) light, a spatial resolution of 3 pixels at a 3 dB contrast, and a frame rate of 32.3 frames per second (fps) at VGA resolution, comparable to standard RGB cameras. AI-based image reconstruction further accelerates the frame rate to 34.4 fps at full HD resolution. This innovation offers a compact, efficient, and high-performance solution for real-world high-speed hyperspectral imaging, with potential applications in consumer devices, including smartphones and drones. The system’s reliance on iterative reconstruction may pose computational challenges, necessitating further optimization for widespread adoption. A compressive sensing method for hyperspectral image reconstruction that leverages the fast iterative shrinkage-thresholding algorithm (FISTA) for efficient recovery has also been reported37. FISTA enables improved reconstruction quality over conventional wavelet methods, especially when combined with patch-based encoding and randomized matrices12.
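For reference, FISTA solves the sparsity-regularized least-squares (LASSO) problem with an accelerated proximal gradient iteration. The following is a minimal sketch on a synthetic sparse-recovery toy problem; the matrix sizes and regularization weight are illustrative choices, not values from the cited works:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (elementwise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, y, lam=1e-3, n_iter=500):
    """FISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        x_new = soft_threshold(z - A.T @ (A @ z - y) / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x

# Toy demo: recover a 3-sparse spectrum from 36 random measurements.
rng = np.random.default_rng(1)
A = rng.normal(size=(36, 200)) / np.sqrt(36.0)
x_true = np.zeros(200)
x_true[[40, 90, 150]] = [1.0, 0.6, 0.8]
x_hat = fista(A, A @ x_true)
print(np.max(np.abs(x_hat - x_true)))   # small error for sparse, noiseless input
```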

Tan et al.38 proposed utilizing smartphones as speckle spectrometers to achieve good results with minimal hardware. The authors’ proposed model reconstructs VIS-wavelength spectra from 470 to 670 nm within a second using a mobile computing app by injecting light into an optical fiber and capturing the resulting speckle patterns with a smartphone camera, utilizing a reversed-lens concept. Their technique can resolve single and multi-peaked spectra, including metameric pairs. The smartphone-based spectrometer, though offering lower magnification than traditional microscope objectives, provides broadband operation and a resolution of 2 nm. The setup requires several components, including a fiber coupler, optical fiber, and a reversed lens, which together fit within a compact module. While the technique does not surpass current grating spectrometers in performance, it introduces an alternative sensing method with the potential for further development and portability. The initial calibration may be labor-intensive, but the system shows significant promise for overcoming traditional spectrometer size limitations.

Bielczynski et al.39 presented a novel, portable, handheld Vis–NIR spectrometer capable of non-invasive plant pigment quantification, showing promise for advanced precision agriculture. The device demonstrated impressive accuracy in estimating anthocyanin and chlorophyll contents, achieving correlation coefficients of 0.84 and 0.77, respectively, with conventional gold-standard methods. The authors’ proposed design integrates wireless data transfer and dual preprogrammed methods for pigment quantification, offering cost-effectiveness and ease of use. Their results validate the spectrometer’s reliability for indoor applications and its potential for routine plant health monitoring. However, its performance under outdoor conditions remains to be validated. Moreover, the limited spectral range restricts its utility to specific vegetation indices, leaving room for further improvements to broaden its capabilities. This work represents a significant step toward developing affordable and portable plant monitoring tools, but highlights the need for expanded applications and rigorous field testing to facilitate widespread adoption.

Huang et al.15 introduced a computational spectrometer using etalon filters. They combined a 10 \(\times\) 10 array of etalon filters with a CCD array as the hardware configuration. By varying the thickness of the cavity layer, each etalon filter achieved a unique transmittance pattern. The L1-norm minimization method10 was employed to reconstruct the input spectra from measurements of the etalon array. The results were demonstrated by reconstructing the transmitted light from bandpass filters and the spectra from the laser source.

Wang et al.17 utilized a photonic crystal slab to configure spectral encoders. Unique transmittance patterns were obtained by varying structural parameters, including slab size, lattice shape, and the distance between holes. The input spectra were reconstructed by minimizing regularized squared error with non-negativity constraints. They demonstrated the efficacy of their method by reconstructing monochromatic lights such as LEDs, HeNe lasers, and the outputs of a monochromator.

Li et al.13 employed quantum dot filters, varying the ratio of oleic acid to precursor, to achieve unique transmittance patterns. The Total Variation (TV) algorithm was used to reconstruct the input spectra. They demonstrated their results by reconstructing the reflected light from objects such as a watermelon, grape, and spinach. Similarly, Li et al.40 improved the CASSI system with a dual-camera design, utilizing structural information from a grayscale camera along with TwIST11 and TV regularization to enhance hyperspectral image reconstruction. This approach significantly improved image quality and reduced runtime, achieving a PSNR gain of 8.99 dB, a structural similarity (SSIM) increase of 0.0757, and a spectral angular mapper (SAM) reduction of 0.1987.

Kim et al.23,24 designed MTFs with unique transmittance patterns by randomly omitting the intermediate layers from a reference MTF with 19 layers. The L1-norm minimization method10 was utilized to recover input signals. Their demonstrations included reconstructing monochromatic lights from LEDs and outputs of a monochromator. Additionally, in24, they showed hyperspectral imaging of an LED matrix using a pinhole camera model. Despite these advancements, demonstrations of computational spectrometers have been limited to a narrow range of examples and types, including monochromatic lights, LEDs, and laser sources. Still, the proposed MTF filter array, with a compact footprint of 4.5 \(\times\) 4.5 mm2 and operating in the 500–850 nm wavelength range, enables hyperspectral imaging (HSI) in miniaturized devices, such as Magnetic-Assisted Capsule Endoscopy (MACE). Integrated with a CMOS sensor, this MTF filter array allows selective spectral filtering for tissue differentiation. It supports computational algorithms like SAVE (Spectrum-Aided Visual Enhancer) to convert White Light Images (WLI) into enhanced images similar to Narrow Band Imaging (NBI), enabling improved mucosal visualization and early cancer detection in resource-constrained systems41,42.

The numerical optimization methods10,11,12 used for spectra reconstruction assume that all signals in nature are sparse or can be sparsely represented in a specific ___domain. However, not all spectra can be sparsely represented using a fixed sparsifying basis, resulting in limitations to their representation capability. Moreover, these approaches perform well for precisely measured signals and handcrafted parameters predetermined through prior information, such as spectral sensitivities, sparsifying bases, line shapes, and full widths at half maximums (FWHMs) of spectra. Thus, the reconstruction performance of a spectrum could be biased depending on the variations in noise levels and predetermined parameters. These limitations hinder the use of computational spectrometers for accurately recovering various waveforms of spectra.

To mitigate these issues, researchers have employed DL methods as alternatives to numerical optimization methods for computational spectrometers. Kim et al.43 utilized a convolutional neural network (CNN) to recover input spectra from measurements taken by a proposed MTFs-based spectrometer. They trained their network using a synthetic dataset created by combining multiple Gaussian distribution functions with varying FWHMs, peak locations, and amplitudes. Their simulation demonstrated that the reconstruction results of the proposed method outperformed conventional numerical optimization methods10. In a subsequent study44, they introduced ResCNN to improve the reconstruction results further. They trained and evaluated the network using synthetic datasets with Gaussian and Lorentzian distribution functions, as well as spectral datasets such as the US Geological Survey (USGS) spectral library version 745 and the Munsell colors glossy spectral dataset46. Performance was validated through simulation experiments. Wen et al.47 proposed a DL-based spectrometer using dielectric films. They trained the reconstruction network with a synthetic dataset featuring Gaussian distribution functions and spectral datasets48,49. Their reconstruction performance was demonstrated with simulated spectra and monochromatic lights. Brown et al.50 proposed a DL-based spectrometer using plasmonic encoders. Unlike the methods proposed by Kim et al.43,44, they did not use prior knowledge of filter functions to initialize the first layer of the reconstruction network. To train and evaluate their spectrometer, they collected an experimental dataset using a programmable supercontinuum laser, comprising 60,644 pairs of spectra with varying numbers of spectral peaks and CS measurements from the proposed spectrometer. Initially, the spectrometer performed well for 15 hours of continuous experimentation; however, its performance degraded over time due to environmental variations. 
To address this issue, they captured additional data pairs and applied transfer learning techniques. For DL-based computational spectrometers, demonstrating the reconstruction results of raw measurements can be hindered by ___domain shift, which occurs when the distribution of simulated measurements differs from that of real measurements. This discrepancy arises because the datasets used to train reconstruction networks do not include the actual measurement noise of sensors when measuring input lights. Thus, the difference between training datasets and real measurements has a significant impact on the generalizability of the reconstruction network.
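Synthetic Gaussian training data of the kind described for Kim et al.43 can be approximated by a simple generator; the peak-count, FWHM, and amplitude ranges below are assumptions for illustration, not the published settings:

```python
import numpy as np

def synth_spectrum(lam, rng, max_peaks=4):
    """One synthetic spectrum: a random mixture of Gaussian lines with varying
    peak locations, FWHMs, and amplitudes (all ranges here are assumptions)."""
    x = np.zeros_like(lam)
    for _ in range(rng.integers(1, max_peaks + 1)):
        center = rng.uniform(lam[0], lam[-1])
        fwhm = rng.uniform(5.0, 80.0)     # from narrow to broad lines, in nm
        sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
        x += rng.uniform(0.2, 1.0) * np.exp(-((lam - center) ** 2) / (2.0 * sigma ** 2))
    return x / x.max()                     # normalize each spectrum to [0, 1]

rng = np.random.default_rng(42)
lam = np.linspace(500.0, 850.0, 351)
dataset = np.stack([synth_spectrum(lam, rng) for _ in range(1000)])
print(dataset.shape)   # → (1000, 351)
```

A dataset generated this way contains no sensor noise, which is exactly the ___domain-shift concern raised above; adding a measured noise model would narrow that gap.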

Zhang et al.51,52 combined DL and CS techniques to enhance the performance of quantum dot spectrometers. In51, instead of directly applying a neural network to recover the input signal, they first used numerical optimization algorithms to recover the signals. Then they applied their neural network to refine the results. Conversely, in52, a neural network was used to denoise the CS measurements before applying numerical optimization to recover the input signals. They trained and evaluated the neural network using a spectral dataset46 and an experimental dataset comprising 704 pairs of measurements from the proposed spectrometer and spectra obtained using various combinations of colored plastics. By integrating DL into the CS framework, they improved the results of CS recovery algorithms. However, the inherent limitations of numerical optimization methods remain.

In this work, to address the issues mentioned above, we propose a DL-based single-shot computational spectrometer using MTFs. Our DL architecture is trained on a dataset of 3223 spectra and raw measurements obtained through various combinations of color filters and a monochromator, resulting in an abundance of spectral features. Our experiments demonstrate that the proposed DL architecture effectively reconstructs both narrow and broad spectra, benefiting from the richness of our dataset. Additionally, we showcase the reconstruction of transmitted spectra through commercial drinks, highlighting the spectrometer’s suitability for on-site detection applications such as drink inspection, counterfeit document detection, and self-diagnosis.

Methods

The proposed method comprises three main components: fabrication of the MTF filter array, image acquisition, and signature reconstruction and identification using DL.

Fabrication of the MTF filter array

Unlike the etalon filters (see “Related work”), which are fabricated by varying the thickness of interspacing dielectric layers, MTF filters are fabricated by modifying both the number of layers and their thicknesses. MTF filters with unique transmission functions are achieved by alternately depositing two materials with varying thicknesses. A total of 36 MTF filters with distinct transmission characteristics were fabricated by selectively omitting specific layers using shadow masks during the deposition process. The omission of an intermediate layer causes the upper and lower layers to merge, forming a single layer with a modified thickness. The designed layer thicknesses for the MTF filters are provided in Table 1. The filter array was fabricated using wafer-level stencil lithography with shadow masks, which allows for the stacking of alternating layers of titanium dioxide (TiO2) and silicon dioxide (SiO2), enabling scalable, reproducible fabrication suitable for mass production. TiO2 and SiO2 act as the high and low refractive index materials, respectively. The refractive indices for TiO2 and SiO2 are approximately 2.6 and 1.45 at 600 nm, respectively. As a result, 169 identical filter arrays were fabricated on a single wafer24.

These films were deposited onto a borosilicate glass wafer, which has a refractive index of approximately 1.472 at 588 nm, using shadow masks to define specific areas for deposition. The TiO2 layers were deposited using direct current (DC) magnetron sputtering, a technique for depositing thin films onto a substrate by bombarding a target material with high-energy ions, causing atoms to be ejected and deposited as a thin layer on the substrate53,54,55,56. A Ti target was sputtered in a mixed gas flow of 188 sccm (standard cubic centimeters per minute) argon (Ar) and 12 sccm oxygen (O2), with the DC power set to 700 W. Shadow masks ensured that TiO2 was deposited only on designated regions of the wafer. For the SiO2 layers, radiofrequency (RF) magnetron sputtering was used. A Si target was sputtered in a gas flow of 185 sccm Ar and 15 sccm O2, with RF power at 300 W. Different shadow masks were used to control the deposition patterns and thicknesses of the SiO2 layers. The alternating deposition of TiO2 and SiO2 layers was repeated 17 times, with shadow masks changed accordingly to achieve the desired multilayer structure.

Individual depositions of TiO2 were performed ten times, while SiO2 was deposited nine times. Following the completion of thin film deposition, the surface of the thin films was coated with a photoresist. Germanium (Ge) was then deposited across the entire wafer area using an e-beam evaporator. The lift-off process was carried out by soaking the deposited wafer in acetone, allowing the photoresist to dissolve. As the photoresist was removed, the Ge layer deposited on top of it was lifted off and washed away. This process resulted in the formation of a square Ge grid with a size of 400 \(\upmu\)m and a spacing of 300 \(\upmu\)m. The Ge grid was designed to separate the MTF filters and prevent incident light from interfering with them. Finally, the wafer underwent a cleaning process before being diced to produce the MTF filter arrays.

After fabrication, the filters with unique transmission functions were obtained by stacking multiple layers of thin films with varying numbers and thicknesses. These carefully designed MTF filters enable the capture of broad-spectrum light across a wide range of wavelengths. Integrated with a complementary metal-oxide-semiconductor (CMOS) sensor, the MTF filter array serves as a fundamental component in the proposed computational spectroscopy system57,58,59.

It is worth mentioning that using TiO2 as a high-refractive-index material instead of SiNx, which was used in previous work23, could reduce the number of layers required to achieve the unique transmission functions. Moreover, by utilizing stencil lithography, the MTF filter can be fabricated through a simplified process that eliminates the need for an etching step.
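How a given stack of layer indices and thicknesses maps to a transmission function can be sketched with the standard characteristic (transfer) matrix method at normal incidence. The stack below uses illustrative thicknesses rather than the Table 1 design, and it ignores dispersion and absorption:

```python
import numpy as np

def transmittance(layers, wavelengths, n_in=1.0, n_sub=1.472):
    """Normal-incidence transmittance of a lossless dielectric stack via the
    characteristic (transfer) matrix method. `layers` is a list of
    (refractive_index, thickness_nm) pairs, listed from the incidence side."""
    out = np.empty(len(wavelengths), dtype=float)
    for k, lam in enumerate(wavelengths):
        M = np.eye(2, dtype=complex)
        for n, d in layers:
            delta = 2.0 * np.pi * n * d / lam   # phase thickness of the layer
            M = M @ np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                              [1j * n * np.sin(delta), np.cos(delta)]])
        B, C = M @ np.array([1.0, n_sub])
        out[k] = 4.0 * n_in * n_sub / abs(n_in * B + C) ** 2
    return out

# Alternating TiO2 (n ~ 2.6) / SiO2 (n ~ 1.45) stack with illustrative
# thicknesses; the real filters use the Table 1 values and vary per filter.
stack = [(2.6, 60.0), (1.45, 100.0)] * 5
lam = np.linspace(500.0, 850.0, 351)
T = transmittance(stack, lam)
print(T.min(), T.max())   # oscillatory, wavelength-dependent transmission
```

Varying the number of layers and their thicknesses, as the fabrication above does, reshapes this transmission curve, which is what gives each filter its unique spectral signature.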

SEM analysis of the MTF filter

The Ultra-High-Resolution Field Emission Scanning Electron Microscope (UHR-FE-SEM, model Verios 5 UC) image in Fig. 1a provides detailed elemental information. Energy Dispersive X-ray Spectroscopy (EDX) analysis of the fabricated Ge/SiO2/TiO2 multilayer thin films on borosilicate glass substrates revealed trace amounts of Na and C. The presence of Na is attributed to possible ion diffusion from the borosilicate substrate, which inherently contains alkali metals. The detected carbon is likely due to surface contamination from ambient exposure or handling, a common occurrence in surface-sensitive characterization techniques.

Fig. 1
figure 1

SEM analysis of the fabricated MTF filter array (a) elemental view under UHR-FE-SEM, (b) surface view, (c) cross-sectional SEM image of MTF filter 36.

Figure 1b, also captured using the UHR-FE-SEM (Verios 5 UC), shows the surface morphology of the MTF filter array. Minimal measurement deviations are observed, ranging from 0.0032 to 0.0075 \(\upmu\)m. The SEM image displays each filter unit as approximately 392.5 \(\times\) 392.5 \(\upmu\)m2, while the designed dimensions are 400 \(\times\) 400 \(\upmu\)m2. Similarly, the inter-filter spacing is measured as 296.8 \(\upmu\)m against the intended 300 \(\upmu\)m. The deposited Ge layer is designed to cover an area of 500 \(\times\) 500 \(\upmu\)m2, whereas the measured dimension from the SEM image is approximately 493.3 \(\times\) 493.3 \(\upmu\)m2.

Figure 1c shows a uniform deposition of SiO2/TiO2 layers both within a single filter and consistently across multiple filters. This cross-sectional image was obtained using the Hitachi Focused Ion Beam (FIB) system, model NX5000, which also integrates SEM imaging functionality. Although attempts were made to image the cross-section with the UHR-FE-SEM (Verios 5 UC), the NX5000 FIB system yielded superior cross-sectional contrast and clarity for our sample. Figure 1c specifically corresponds to Filter 36 and confirms both uniformity in deposition and consistency in filter width.

The fourth layer in this cross-section appears to be the thickest, as it combines Layer 4 (SiO2), Layer 5 (omitted TiO2), and Layer 6 (SiO2). According to Table 1, the individual thicknesses are 188 nm (SiO2), 0 nm (TiO2), and 109 nm (SiO2), summing to a theoretical total of 297 nm. Due to the omission of TiO2 (Layer 5), this segment represents a merged SiO2 region. The SEM-measured thickness for this combined region is 249 nm, a difference of 48 nm that falls within the acceptable measurement tolerance. The SEM system’s correction error is noted to be 0.0791 \(\upmu\)m, supporting the validity of the measured thickness values.

Table 1 MTF filter thickness information.

A total of eleven distinguishable layers were observed instead of the expected 19 layers. This is due to the apparent merging of layers 4 and 6, 8 and 10, and 13 and 15, along with the omission of layers 5, 9, 14, 16, and 17, possibly during deposition or imaging. The analysis excluded regions above and below the yellow rectangle in the SEM image. The top region corresponds to the filter surface, while the bottom includes the base layer of Ge and borosilicate wafer, which exhibit electron charging effects typical of insulating substrates during SEM imaging, and this is observed as a deflected bright region. Additional thickness measurements presented in Fig. 1c further confirm the precision and consistency of the multilayer deposition across the filter structure.

Problem formulation: a system model

The proposed study focuses on reconstructing spectral information from single-shot intensity measurements obtained through an MTF filter array on a CMOS sensor, a key function of the proposed DL-based computational spectrometer. For the problem formulation, let \({y}\) represent the vector of intensities measured by the CMOS sensor, \({x}\) denote the vector of unknown incident spectrum values at different wavelengths, \({T}\) be the transmission matrix of the MTF filters, \({Q}\) be the diagonal matrix representing the quantum efficiency of the CMOS sensor, and \({n}\) be the measurement noise of the system. The relationship between the measured intensities \({y}\) and the incident spectrum \({x}\) is given by Equation 1.

$$\begin{aligned} {y} = {T} {Q} {x} + {n} \end{aligned}$$
(1)

where:

$$\begin{aligned} {x} = \begin{bmatrix} x(\lambda _1) \\ x(\lambda _2) \\ \vdots \\ x(\lambda _N) \end{bmatrix}^{N \times 1} \end{aligned}$$

represents the spectrum at different wavelengths \(\lambda _j\) (with \(j = 1, 2, \dots , N\)).

$$\begin{aligned} {T} = \begin{bmatrix} T_1(\lambda _1) & T_1(\lambda _2) & \cdots & T_1(\lambda _N) \\ T_2(\lambda _1) & T_2(\lambda _2) & \cdots & T_2(\lambda _N) \\ \vdots & \vdots & \ddots & \vdots \\ T_M(\lambda _1) & T_M(\lambda _2) & \cdots & T_M(\lambda _N) \end{bmatrix}^{M \times N} \end{aligned}$$

where \(T_i(\lambda _j)\) is the transmission function of the \(i\)-th MTF filter at wavelength \(\lambda _j\), and there are \(M\) filters.

$$\begin{aligned} {Q} = \begin{bmatrix} Q(\lambda _1) & 0 & \cdots & 0 \\ 0 & Q(\lambda _2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Q(\lambda _N) \end{bmatrix}^{N \times N} \end{aligned}$$

where \(Q(\lambda _j)\) is the spectral response (quantum efficiency) of the CMOS sensor at wavelength \(\lambda _j\). If the sensor had interference between wavelengths (e.g., if detecting light at \(\lambda _1\) affected the response at \(\lambda _2\)), then \(Q\) would contain nonzero off-diagonal elements. However, in a CMOS sensor, each wavelength \(\lambda _j\) has an independent response, so only the diagonal elements are nonzero; this is why \(Q\) is taken to be a diagonal matrix. It simplifies computation in spectral reconstruction. Since \(Q\) is diagonal, multiplying it with \(x\) (the incident spectrum) is straightforward.

$$\begin{aligned} {Q} {x} = \begin{bmatrix} Q(\lambda _1) x(\lambda _1) \\ Q(\lambda _2) x(\lambda _2) \\ \vdots \\ Q(\lambda _N) x(\lambda _N) \end{bmatrix}^{N \times 1} \end{aligned}$$

This means that each wavelength’s intensity is simply scaled by its respective quantum efficiency, making calculations more efficient.

$$\begin{aligned} A = T Q = \begin{bmatrix} T_1(\lambda _1) & T_1(\lambda _2) & \cdots & T_1(\lambda _N) \\ T_2(\lambda _1) & T_2(\lambda _2) & \cdots & T_2(\lambda _N) \\ \vdots & \vdots & \ddots & \vdots \\ T_M(\lambda _1) & T_M(\lambda _2) & \cdots & T_M(\lambda _N) \end{bmatrix}^{M \times N} \begin{bmatrix} Q(\lambda _1) & 0 & \cdots & 0 \\ 0 & Q(\lambda _2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Q(\lambda _N) \end{bmatrix}^{N \times N} \\ = \begin{bmatrix} T_1(\lambda _1) Q(\lambda _1) & T_1(\lambda _2) Q(\lambda _2) & \cdots & T_1(\lambda _N) Q(\lambda _N) \\ T_2(\lambda _1) Q(\lambda _1) & T_2(\lambda _2) Q(\lambda _2) & \cdots & T_2(\lambda _N) Q(\lambda _N) \\ \vdots & \vdots & \ddots & \vdots \\ T_M(\lambda _1) Q(\lambda _1) & T_M(\lambda _2) Q(\lambda _2) & \cdots & T_M(\lambda _N) Q(\lambda _N) \end{bmatrix}^{M \times N} \\= \begin{bmatrix} A_1(\lambda _1) & A_1(\lambda _2) & \cdots & A_1(\lambda _N) \\ A_2(\lambda _1) & A_2(\lambda _2) & \cdots & A_2(\lambda _N) \\ \vdots & \vdots & \ddots & \vdots \\ A_M(\lambda _1) & A_M(\lambda _2) & \cdots & A_M(\lambda _N) \end{bmatrix}^{M \times N} \end{aligned}$$

Each element \(A_i(\lambda _j)\) is given as \(A_i(\lambda _j) = T_i(\lambda _j) Q(\lambda _j)\). This represents the combined transmission of the \(i\)-th MTF filter and the spectral response of the CMOS sensor at wavelength \(\lambda _j\). So, by combining the transmission matrix \({T}\) and the spectral response matrix \({Q}\) into a sensing matrix \({A}\), the system model (Equation 1) simplifies to Equation 2.

$$\begin{aligned} {y} = {A} {x} + {n} \end{aligned}$$
(2)

Here, \({y}\) represents the intensities recorded by the CMOS sensor for each of the \({M}\) filters.

$$\begin{aligned} {y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix}^{M \times 1} \end{aligned}$$
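As a sanity check on the dimensions in Eqs. (1) and (2), the forward model can be simulated with synthetic matrices; the transmission values and quantum efficiencies below are random placeholders, not the fabricated filters' responses:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 36, 350  # 36 MTF filters, 350 wavelength bins (500-850 nm, 1 nm spacing)

T = rng.uniform(0.0, 1.0, size=(M, N))      # synthetic broadband transmission functions
Q = np.diag(rng.uniform(0.3, 0.9, size=N))  # diagonal quantum-efficiency matrix
x = rng.uniform(0.0, 1.0, size=(N, 1))      # unknown incident spectrum
n = 0.01 * rng.standard_normal((M, 1))      # measurement noise

A = T @ Q          # sensing matrix, Eq. (2)
y = A @ x + n      # measured intensities, Eq. (1)
print(y.shape)     # (36, 1): one intensity per filter
```

Because \(Q\) is diagonal, `T @ Q` is equivalent to scaling each column of `T` by the corresponding quantum efficiency, which is exactly the element-wise product described above.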

The following section discusses how a DL algorithm, U-Net, based on convolutional neural networks (CNN), is integrated into the system. The DL-based architecture processes the captured intensities to reconstruct the whole spectrum with high accuracy, leveraging the tailored properties of the MTF filter array to enhance resolution and minimize errors.

DL-based computational spectroscopy

DL-based computational spectroscopy leverages neural networks to reconstruct spectra from compressed or indirect measurements, facilitating the design of compact, low-cost spectrometers. Unlike traditional spectrometers that rely on precise optical components to directly measure specific wavelengths, DL-based systems utilize broad-spectrum data captured through photonic structures such as multilayer thin films or filter arrays. The relationship between the measured intensities \({y}\) and the true spectrum \({x}\) was derived above as \({y} = {A} {x} + {n}\) (Eq. 2), where \({A}\) denotes the system’s response matrix, and \({n}\) represents noise. In this context, a U-Net architecture is employed to approximate the mapping between the captured intensities \({y}\) and the reconstructed spectrum \(\hat{{x}}\) as Eq. (3):

$$\begin{aligned} \hat{{x}} = f_{\text {U-Net}}({y}; \theta ) \end{aligned}$$
(3)

where \(\theta\) comprises the trainable parameters (weights and biases) of the U-Net model.

To evaluate the reconstruction performance, the RMSE is preferred over the mean squared error, \(MSE = \frac{1}{n} \sum _{i=1}^{n} ( x_i - {\hat{x}}_i )^2\), because it is in the same units as the target variable, making it easier to interpret. Additionally, the RMSE penalizes larger errors more heavily due to the squaring of differences, making it more sensitive to outliers. Because of these characteristics, it is used as the loss function given by Eq. (4), which quantifies the difference between the ground truth (GT) spectra \({x}\) and the reconstructed spectra \(\hat{{x}}\):

$$\begin{aligned} {\mathcal {L}}(\theta ) = \sqrt{\frac{1}{n} || \hat{{x}} - {x}||^2_2} = \sqrt{\frac{1}{n} \sum _{i=1}^{n} ( x_i - {\hat{x}}_i )^2} \end{aligned}$$
(4)
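A minimal sketch of the RMSE loss of Eq. (4), here in NumPy rather than the PyTorch implementation used for training:

```python
import numpy as np

def rmse_loss(x_hat, x):
    """RMSE between a reconstructed spectrum and its ground truth (Eq. 4)."""
    x_hat, x = np.asarray(x_hat, dtype=float), np.asarray(x, dtype=float)
    return np.sqrt(np.mean((x - x_hat) ** 2))

# A perfect reconstruction has zero loss; a constant offset of 2 gives RMSE 2.
print(rmse_loss([1.0, 1.0], [1.0, 1.0]))  # 0.0
print(rmse_loss([3.0, 3.0], [1.0, 1.0]))  # 2.0
```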

The optimization of the U-Net parameters is performed using the Adam optimizer60, which updates the parameters based on the gradient of the RMSE loss function:

$$\begin{aligned}g_t = \frac{\partial {\mathcal {L}}}{\partial \theta } \bigg |_{\theta = \theta _t}\end{aligned}$$

The first moment \(m_t\) and the second moment \(v_t\) of the gradients are updated as follows:

$$\begin{aligned} m_t = \beta _1 m_{t-1} + (1 - \beta _1) g_t \\ v_t = \beta _2 v_{t-1} + (1 - \beta _2) g_t^2 \end{aligned}$$

where \(\beta _1\) and \(\beta _2\) are decay rates for the first and second moments, respectively. The bias-corrected moment estimates are given by:

$$\begin{aligned} {\hat{m}}_t = \frac{m_t}{1 - \beta _1^t}, \quad {\hat{v}}_t = \frac{v_t}{1 - \beta _2^t} \end{aligned}$$

Finally, the model parameters are updated as follows:

$$\begin{aligned} \theta _{t+1} = \theta _t - \eta \frac{{\hat{m}}_t}{\sqrt{{\hat{v}}_t} + \epsilon } \end{aligned}$$

where \(\eta\) is the learning rate, and \(\epsilon\) is a small constant used to prevent division by zero. This DL-based approach not only enhances spectral resolution but also accelerates measurement processes, resulting in smaller, mobile-friendly devices that are suitable for on-site applications.
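The Adam update equations above can be implemented directly. The toy loss \(L(\theta ) = \theta ^2\), the step size, and the starting point below are illustrative choices, not values from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, eta=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update following the moment and bias-correction equations above."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment update
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment update
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy loss L(theta) = theta^2, so g = 2*theta; one step from theta = 1.0.
theta, m, v = 1.0, 0.0, 0.0
theta, m, v = adam_step(theta, 2.0 * theta, m, v, t=1)
print(round(theta, 6))  # 0.9: after bias correction, the first step has size ~eta
```

Note that on the first step the bias correction exactly cancels the moment decay, so the update magnitude is approximately the learning rate regardless of the gradient scale.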

Experimentation

Experimental setup

Figure 2a depicts a schematic of our experimental setup. A collimated beam is divided into two beams after passing through a color filter and a long pass filter. The spectrum of one split beam was measured using a commercial spectrometer (Black-Comet, StellarNet), which served as the GT. The other beam was fed into the MTF filter array and modulated by the transmission functions of the filters. The modulated intensities of the beam were captured with a CMOS camera (EO-1312M, Edmund Optics) as a monochrome image with a single exposure. By connecting the spectrometer and CMOS camera to a laptop using universal serial bus (USB) cables, we simultaneously collected each monochrome image and its corresponding GT spectrum. We collected a dataset comprising 3223 pairs of GT spectra and corresponding monochrome images: 350 narrow spectra with a full width at half maximum (FWHM) of 4 nm and 2873 broad spectra. To collect narrow spectra, we used a halogen lamp (KLS-150H-LS-150D, Kwangwoo) and a monochromator (MMAC-200, Mi Optics). A beam from the halogen lamp was fed into the monochromator, generating a narrow spectrum. By changing the peak locations of the narrow spectra from 500 to 849 nm with 1 nm spacing, we measured 350 spectra. To collect broad spectra, we generated various shapes of spectra using color filters (Color filter booklet, Edmund Optics), as shown in Fig. 2a. A beam from the halogen lamp was modulated by the color filters, generating a broad spectrum. By changing combinations of color filters, we measured 2873 broad spectra of various waveforms.

The captured monochrome image had a size of 1280 \(\times\) 1024 pixels, and the GT spectrum comprised a signal of 350 elements measured over the wavelength range, \(\lambda\), of 500–850 nm with a 1 nm spacing. As shown in Fig. 2b, we extracted 36 intensities from the filter array in the monochrome image. These intensities were fed into the DL architecture to reconstruct the spectra. Figure 2c shows examples of reconstructed test spectra using the trained DL architecture. The RMSE between the GT spectra (dashed black lines) and reconstructed spectra (solid blue lines) was used to evaluate the reconstruction performance. The reconstructed spectra were consistent, as indicated by the RMSE values listed in the upper left corner of each plot in this figure.

Fig. 2
figure 2

DL-based single-shot computational spectrometer. (a) Schematic of the experimental setup. A collimated beam from a light source is split into two beams by a beam splitter (BS) after passing through a color filter and a long pass filter (LPF). The spectrum of a beam is measured using a commercial spectrometer. The other beam is modulated by the MTF filter array and captured as an image by a CMOS camera. (b) DL architecture comprises a dense layer and a U-Net backbone architecture with residual connections. (c) Examples of recovered test spectra using the trained DL architecture. Dashed black lines represent GT spectra measured using a commercial spectrometer. Solid blue lines represent reconstructed spectra using the proposed spectrometer.

Figure 3a shows a photograph and an optical microscope image of the fabricated MTF filter array. The MTF filter array had a 6 \(\times\) 6 square grid shape. Each MTF filter had a size of 400 \(\times\) 400 \(\upmu\)m2, and the filters were 300 \(\upmu\)m apart. The entire size of the filter array was 4.5 \(\times\) 4.5 mm2. Each MTF filter had its own color due to its unique transmission function, T, and the color was uniform across each filter. Figure 3b shows examples of spectral sensitivities of the fabricated filters with the CMOS camera. Unlike bandpass optical filter-based spectrometers, the MTF filter-based computational spectrometer modulates the spectrum of incident light in a wide wavelength range with broad spectral sensitivities. Therefore, a few filters are sufficient to measure the spectral information of the incident light uniquely. Figure 3c shows the measured data. The CMOS camera and halogen lamp were calibrated to extract intensities from monochrome images in the range of 0–255, and the CMOS camera’s auto contrast function was turned off. The spectra were measured with a fixed integration time. A single pixel of the CMOS camera had a size of 5.2 \(\times\) 5.2 \(\upmu\)m2. Underneath each filter, there were approximately 70 \(\times\) 70 pixels. From the monochrome image of the MTF filter array illuminated by the incident light (left), 36 measured intensities were extracted by taking the average value of the central 40 \(\times\) 40 pixels of each filter as one measured intensity. For pixels at the filter boundary, there might have been a misalignment during fabrication, and a beam passing through a filter could overlap another beam in experiments. Therefore, we excluded pixels on the filter boundaries. Three examples of measured intensities are plotted in the center of Fig. 3c, corresponding to the GT spectra on the right of Fig. 3c.
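The intensity-extraction step can be sketched as follows; the filter-center coordinates and fill values below are hypothetical, while the central 40 \(\times\) 40 averaging mirrors the procedure described above:

```python
import numpy as np

def extract_intensities(img, centers, half=20):
    """Average the central 40x40 pixels of each filter region (boundaries excluded)."""
    return np.array([img[r - half:r + half, c - half:c + half].mean()
                     for r, c in centers])

# Synthetic 1024x1280 (rows x cols) monochrome frame; the filter-center coordinates
# below are hypothetical -- in practice they follow the 6x6 array geometry on the sensor.
img = np.zeros((1024, 1280))
centers = [(100 + 135 * i, 100 + 135 * j) for i in range(6) for j in range(6)]
for k, (r, c) in enumerate(centers):
    img[r - 35:r + 35, c - 35:c + 35] = k   # ~70x70 pixels lie under each filter

intensities = extract_intensities(img, centers)
print(intensities.shape)  # (36,)
```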

Fig. 3
figure 3

Fabricated MTF filter array for single-shot computational spectroscopy. (a) Photograph (left) and optical microscope image (right) of the fabricated MTF filter array. (b) Examples of spectral sensitivities of MTF filters with the CMOS camera. (c) Examples of measured data include a monochrome image of the MTF filter array (left), some extracted intensities from monochrome images (center), and the GT spectra of incident lights corresponding to the extracted intensities (right).

DL-architecture

The proposed DL architecture consists of a dense layer and a U-Net backbone61 with residual connections. Before entering the U-Net backbone, the 36 measured intensities are extended to a size of 350 by applying a linear transformation using the dense layer. This extension enables the U-Net backbone to become deeper, which benefits feature extraction and reconstruction. The U-Net backbone comprises a contracting path and an expansive path. In the first stage of the contracting path, the extended intensities pass through a main branch comprising a one-dimensional (1D) convolution (Conv), 1D batch normalization (BaN), a rectified linear unit (ReLU), and another Conv. As a shortcut branch, the extended intensities pass through a single Conv. The two branches are added to form a residual connection, whose output becomes the input of the next stage of the contracting path. Like the first stage, the input of the second stage passes through the main and shortcut branches, which are summed in a residual connection. In the main branch, the input size is halved and the number of feature maps doubled using two sets of BaN, ReLU, and Conv; a Conv with stride 2 performs the downsampling. In the shortcut branch, the input passes through a Conv with stride 2 and BaN. The output of the contracting path is upsampled by applying a 1D transposed convolution (ConvTrans) and concatenated with the corresponding feature maps from the contracting path to serve as the input to the first stage of the expansive path. This input is routed through the main and shortcut branches and then summed. The upsampling, concatenation, and summation are repeated four times. The output of the expansive path passes through a convolutional layer to become a signal with 350 elements. Finally, the output signal and the extended intensities are added to form the reconstructed spectrum.
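The downsampling arithmetic of the contracting path can be traced with the standard 1D convolution output-length formula; the kernel size of 3 and padding of 1 below are assumptions for illustration, as the paper does not state them:

```python
def conv1d_out_len(L, k=3, s=1, p=1):
    """Output length of a 1D convolution: floor((L + 2p - k) / s) + 1."""
    return (L + 2 * p - k) // s + 1

# Trace the feature-map length through four stride-2 stages of the contracting
# path, starting from the 350-element output of the dense layer.
lengths, L = [350], 350
for _ in range(4):
    L = conv1d_out_len(L, s=2)
    lengths.append(L)
print(lengths)  # [350, 175, 88, 44, 22]
```

The expansive path then mirrors these lengths in reverse via the transposed convolutions, so each upsampled feature map can be concatenated with its counterpart from the contracting path.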

By leveraging the summation between the extended intensities and the output signal of the U-Net backbone, the U-Net backbone learns the residual between the extended intensities and the GT spectra. Learning this residual is more effective than directly learning the target spectra62. The residual connections in the U-Net backbone also mitigate the vanishing gradient problem, which can stall the update of learnable parameters in a DL architecture during training63. Additionally, they make it possible to increase the depth of the DL architecture.

Training and testing

The proposed DL architecture was trained on our dataset by minimizing the RMSE loss (Eq. 4) between the reconstructed and GT spectra. We randomly divided the dataset into training, validation, and test sets containing 2576, 324, and 323 pairs, respectively. Before training the DL architecture, we performed data preprocessing: the measured intensities from a monochrome image were divided by their maximum value, and the corresponding GT spectrum was min-max normalized. The DL architecture was therefore trained to reconstruct the unknown spectra in normalized intensity form.
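The preprocessing described above amounts to two normalizations, sketched here with random placeholder data:

```python
import numpy as np

def preprocess(intensities, spectrum):
    """Scale intensities by their maximum; min-max normalize the GT spectrum."""
    y = intensities / intensities.max()
    x = (spectrum - spectrum.min()) / (spectrum.max() - spectrum.min())
    return y, x

rng = np.random.default_rng(1)
y, x = preprocess(rng.uniform(10, 200, 36), rng.uniform(0, 4096, 350))
print(y.max(), x.min(), x.max())  # 1.0 0.0 1.0
```

After this step both the network input and target lie in [0, 1], so the RMSE values reported later are in normalized intensity units.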

Using the validation set, we monitored the performance of the DL architecture after every epoch during training, allowing us to select the number of epochs before overfitting occurred. The selected number of epochs was 400. The batch size and learning rate were 8 and 0.0005, respectively. The training process was completed within \(\sim\)1.4 hours, and reconstruction results on the test set were obtained within \(\sim\)2 seconds. The DL architecture was built on the PyTorch framework64. Training and testing were performed on a computer equipped with an Intel Core i7-5820K CPU and an NVIDIA GeForce RTX 2060 graphics processing unit.

Results and discussions

Reconstruction of test spectra

Figure 4 shows the reconstruction results of the proposed computational spectrometer. The RMSE distribution of the 323 test spectra is shown in the histogram in Fig. 4a. Blue and orange boxes represent the RMSE distributions of the 33 narrow and 290 broad spectra, respectively.

Three examples of the best and worst spectral reconstructions are shown in Fig. 4b, c, respectively. Dashed black lines represent GT spectra and solid-colored lines represent the reconstructed spectra. The RMSE value is written in the upper left corner of each graph. The error, defined as \(\hat{{x}}-{x}\), is plotted at the bottom of each graph.

The average RMSE of all test spectra was 0.0288. The average RMSEs of the narrow and broad spectra were 0.0158 and 0.0303, respectively, indicating better reconstruction of the narrow spectra. As shown in Fig. 4b, the reconstructed spectra of the best examples were almost identical to the GT spectra; they captured the abrupt changes at narrow peaks, with peak positions matching well. In contrast, the reconstructed spectra of the worst examples did not accurately reflect the waveform changes of the GT spectra. Excluding the best and worst cases, the DL architecture accurately recovered the test spectra, as shown by the RMSE distribution in Fig. 4a.

Fig. 4
figure 4

Result of spectral reconstructions. (a) Histogram of the RMSE distribution for the test set; the average RMSE is 0.0288. Blue and orange boxes represent the RMSE distribution of narrow and broad spectra, respectively. (b) Three examples of the best spectral reconstructions. (c) Three examples of the worst spectral reconstructions. Dashed black lines represent GT spectra. Solid blue and orange lines represent the reconstructed spectra. Solid light gray lines at the bottom of each graph represent the error.

Figure 5 shows spectral reconstructions of narrow spectra with an FWHM of 4 nm in the test dataset. We present reconstruction results from the test set evenly spaced by peak ___location. The peak locations of the GT spectra in Fig. 5a–f are 520, 588, 655, 707, 768, and 834 nm, respectively. Solid colored lines represent the reconstructed spectra, and dashed black lines represent GT spectra. The RMSE values of the reconstructions are 0.0124, 0.0133, 0.0149, 0.0177, 0.0144, and 0.0198, respectively. The reconstructed spectra exhibit the expected spectral features: narrow peaks with significant intensity changes near the peak locations and negligible intensity elsewhere. As seen in the enlarged inset graphs, the reconstructed spectra match the steep rise of the narrow peaks. The peak ___location differences between the reconstructed and GT spectra are within 1 nm, and the FWHMs of the reconstructed spectra are within 5 nm. Thus, the DL model accurately reconstructed the narrow spectral peaks in the test dataset, regardless of their ___location within the spectrum.

Fig. 5
figure 5

Spectral reconstructions of narrow spectra in the test set. Solid colored lines represent recovered spectra. Dashed black lines represent GT spectra. (a) 520 nm. (b) 588 nm. (c) 655 nm. (d) 707 nm. (e) 768 nm. (f) 834 nm. The RMSE between the reconstructed and GT spectrum is written in the upper left corner of each graph. Solid light gray lines at the bottom of each graph represent the error.

Fig. 6
figure 6

Spectral reconstructions of continuous spectra in the test set. Solid orange lines represent the reconstructed spectra. Dashed black lines represent the GT spectra. (a)–(f) Examples of spectra from the first through sixth intervals of the RMSE histogram, respectively. Solid light gray lines at the bottom of each graph represent the error.

Figure 6 shows spectral reconstructions of broad spectra in the test set. We present reconstruction results of broad spectra sampled across the intervals of the RMSE histogram (Fig. 4a). Solid orange lines represent reconstructed spectra, and dashed black lines represent GT spectra. The RMSE values of the reconstructions in Fig. 6a–f are 0.0164, 0.0231, 0.0324, 0.0412, 0.0469, and 0.0523, respectively. The reconstructed spectra matched well with the spectral features of the GT spectra. For example, a broad background band with multiple peaks is well-expressed in Fig. 6a–c, and spectral valleys are well-represented in Fig. 6d, e. The reconstruction of the flat-top shape in Fig. 6f matches well with the GT spectrum. In addition, as the error plots show, the differences between the reconstructed and GT spectra are within 0.1. Overall, the proposed DL architecture reproduces the various spectral features of the broad spectra.

From Figs. 4, 5 and 6, we demonstrate the spectral reconstruction performance of the proposed DL architecture. The DL architecture can recover narrow and broad spectra in fine detail. In principle, it could have overfitted to broad spectra because of the imbalance between narrow and broad spectra in the dataset; however, narrow spectra are still well represented thanks to the depth of the layers and the numerous learnable parameters of the DL architecture. Unlike numerical optimization methods that require spectral sensitivities, such as transmission functions and sparsifying bases, the proposed DL architecture does not require prior information to recover unknown spectra. The DL architecture requires a dataset for training, but it provides the reconstruction result end-to-end after training is complete. In addition, the proposed DL architecture reconstructed 323 test spectra within \(\sim\)2 s, which is not possible using numerical optimization methods. This is a significant advantage over numerical optimization methods, which require human intervention for precise parameter tuning in spectral reconstruction.

Fig. 7
figure 7

Reconstructions of the transmission spectra of five drinks. (a) Photograph of drink samples of five colors: pink, yellow, light yellow, blue, and purple. (b)–(f) Reconstructed transmission spectra of drinks. Dashed black lines represent GT spectra, and solid lines, except for light gray, represent the reconstructed transmission spectra using the DL architecture. Solid light gray lines represent the reconstructed transmission spectra using the numerical optimization method, sparse recovery (SR).

Reconstruction of drink spectra

We further explored the spectral resolving ability of the proposed computational spectrometer with commercial drinks. Reconstructions of the transmission spectra of five drinks were performed using the trained DL architecture. The samples of the five drinks were prepared in disposable polystyrene cuvettes with a capacity of 4 ml (Fig. 7a). The monochrome images and transmission spectra of the drinks were measured using the experimental setup depicted in Fig. 2a, replacing the color filters with the drink samples. From each monochrome image, we extracted 36 intensities and fed them into the trained DL architecture, obtaining the reconstruction result. The reconstructed transmission spectra of the five drinks are illustrated in Fig. 7b–f. Dashed black lines represent GT spectra, and solid lines, except for light gray, represent the reconstructed spectra. Solid light gray lines represent the transmission spectra reconstructed using the numerical optimization method of sparse recovery. We used only the trained DL architecture, without human intervention, to reconstruct the transmission spectra. In contrast, sparse recovery required prior information on spectral sensitivities, the optimal sparsifying basis, and numerous interventions to determine the best parameters. The RMSE for each drink is written in the upper left corner of each graph. The average RMSE of the transmission spectra reconstructed using the DL architecture is 0.0554; the average RMSE using sparse recovery is 0.1648. As shown in Fig. 7, the reconstruction results of the DL architecture match well with the GT spectra, whereas the spectra reconstructed by sparse recovery differ significantly from them. The difference in reconstruction performance appears to be due to background noise: because the DL architecture is trained on data containing background noise, it reconstructs stably despite the noise, whereas sparse recovery is sensitive to noise and works well only for precisely measured intensities.

Key considerations in practical scenarios

Environmental factors

In the current experimental setup (see “Experimental setup”), our computational spectrometer is deployed in controlled laboratory conditions. Deploying such a spectrometer outside the controlled environment in real scenarios may introduce challenges related to temperature fluctuations, humidity, optical scattering, and external noise, all of which can affect the optical response of the multilayer thin film (MTF) filters and consequently the measured spectral signal. In our proposed system, the MTF filter array is fabricated using alternating layers of TiO2 and SiO2, deposited via DC and RF magnetron sputtering under tightly controlled gas flows and power settings. The resulting dielectric multilayers exhibit robust and stable interference-based transmission spectra. Moreover, both TiO2 and SiO2 are chemically inert and thermally stable up to 400 °C65,66, particularly when deposited on borosilicate glass, which further enhances resistance to environmental stress67. Argon was used as the inert carrier gas during sputtering to prevent unwanted chemical reactions, while oxygen was added precisely to control stoichiometry, ensuring reproducibility across filter batches. Despite these fabrication advantages, environmental variables may still cause slight shifts in the filters’ transmittance due to humidity-induced changes in refractive index or thermal expansion, which can be modeled as a perturbation. To capture the influence of environmental factors, the transmission matrix of the proposed model (Equation 1) can be written as:

$$\begin{aligned} T = T_0 + \Delta T(T_{\text {env}}, H, S, R) \end{aligned}$$

where \(T_0\) is the nominal filter transmission matrix under standard conditions, and \(\Delta T \in {\mathbb {R}}^{M \times N}\) is an environment-dependent perturbation. Specifically, \(\Delta T\) is modeled as a function of key physical variables: temperature (\(T_{\text {env}}\)), relative humidity (\(H\)), optical scattering or misalignment (\(S\)), and ambient reflectance or stray light (\(R\)). These variables can affect the refractive indices and physical thicknesses of the dielectric layers, thereby altering the spectral characteristics of the filters. As a result, the effective sensing matrix becomes:

$$\begin{aligned} A = (T_0 + \Delta T) Q = A_0 + \Delta A \end{aligned}$$

Substituting into the forward model yields:

$$\begin{aligned} y = (A_0 + \Delta A) x + n = A_0 x + \Delta A x + n \end{aligned}$$

The perturbation term \(\Delta A x\) represents an environment-induced distortion in the measurement, which varies depending on both the external conditions and the input spectrum. To mitigate the impact of such variations, we can adopt two complementary strategies. First, we can simulate perturbations \(\Delta A\) during the training phase of our DL-based reconstruction model, enabling the network to learn robust representations of a wide range of environmental conditions. These synthetic perturbations are intended to be derived from empirical measurements and physics-based models of thin-film behavior under varying temperature and humidity conditions. Second, we plan to integrate environmental sensors, such as miniature temperature and humidity sensors, into the spectrometer housing. These sensors allow for real-time estimation of the environmental state, which can be used to approximate \(\Delta A\) and perform a compensation step before reconstruction:

$$\begin{aligned} \Delta y = \Delta A x \\ y_{\text {corrected}} = y - \Delta y \end{aligned}$$

This correction pipeline can enhance the reliability of spectral reconstruction under fluctuating conditions, providing a foundation for adapting the system to uncontrolled outdoor or industrial environments. By modeling and addressing the effects of environmental perturbations on the MTF filter array, an extended version of the proposed model may improve the spectrometer’s applicability beyond laboratory settings, ensuring robust and accurate operation in real-world scenarios. For the proposed MTF-filter-based spectrometer, incorporating the known system matrix \({\bf{A}}\) into the loss function or embedding it as a layer within the network architecture (e.g., \(\hat{{\bf{x}}} = f_{\theta }({\bf{A}}, {\bf{y}})\)) can help directly enforce the physical measurement model during training. This hybrid strategy ensures that the reconstructed spectra remain consistent with both the learned representations and the known optical characteristics of the TiO2/SiO2 multilayer thin-film filter stack. Although our MTF filter arrays have only been tested in a mild laboratory environment, the robustness of TiO2 and SiO2 films deposited by plasma-enhanced chemical vapor deposition (PECVD) may allow the system to adapt to complex application environments under various conditions (temperature, humidity, light illumination, etc.)68.
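A toy sketch of this compensation idea, using a small synthetic system in place of the real MTF sensing matrix: a first reconstruction from the uncorrected measurement gives a rough spectrum estimate, from which \(\Delta y\) is approximated and subtracted before re-solving. A least-squares solve stands in here for the DL reconstructor; all matrices and magnitudes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20                                               # small toy problem
A0 = np.eye(N) + 0.05 * rng.standard_normal((N, N))  # nominal, well-conditioned sensing matrix
dA = 0.005 * rng.standard_normal((N, N))             # environment-induced perturbation
x_true = rng.uniform(0, 1, N)

y = (A0 + dA) @ x_true                    # measurement taken under perturbed conditions

x1 = np.linalg.lstsq(A0, y, rcond=None)[0]       # naive estimate ignoring the perturbation
y_corr = y - dA @ x1                             # approximate Delta-y from the rough estimate
x2 = np.linalg.lstsq(A0, y_corr, rcond=None)[0]  # re-solve with the corrected measurement

e1 = np.linalg.norm(x1 - x_true)
e2 = np.linalg.norm(x2 - x_true)
print(e2 < e1)  # the compensated solve lands closer to the true spectrum
```

The correction contracts the error by roughly the operator norm of \(A_0^{-1} \Delta A\), so it helps whenever the perturbation is small relative to the nominal sensing matrix.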

Spectra out of learning distribution and overfitting

If the DL model is not trained with sufficiently diverse data, it may underperform on spectra that fall outside the training distribution: when a trained model encounters such spectra, its reconstruction accuracy may drop due to poor generalization. To address this, uncertainty estimation methods such as Monte Carlo (MC) Dropout can help identify unreliable predictions69,70. MC Dropout is highly effective for DL models, especially in scenarios involving uncertainty, limited training data, or out-of-distribution inputs. By enabling dropout during inference, it provides not just predictions but also confidence estimates, helping detect when the model is unsure. This is crucial in applications such as spectroscopy, where unseen spectral patterns may emerge. Unlike traditional deterministic models, MC Dropout provides a scalable Bayesian approximation without requiring modifications to the network architecture. Similarly, to improve robustness against spectra outside the training distribution, a Spectral Information Divergence (SID)-based sparse representation classifier can be employed. Unlike traditional L2-norm measures, SID evaluates the probabilistic divergence between spectral signatures, preserving spectral characteristics more effectively and enhancing generalization to unseen spectral patterns71. Out-of-distribution (OOD) detection is crucial for identifying spectra from unknown classes that were not seen during training, thereby improving model robustness in real-world settings. Techniques like OpenPCS-Class leverage intermediate network features to distinguish in-distribution data from OOD data, enabling reliable multi-class classification beyond the training ___domain72. Additionally, hybrid approaches that incorporate physical priors can constrain predictions, thereby improving reliability.
These strategies collectively improve model robustness in real-world spectral sensing scenarios; a few of them are reviewed in Table 2. It is important to note that while these benefits are significant, they come at the cost of increased computational complexity at the DL model level. This complexity can lead to slower reconstruction and decision-making, which may be unsuitable for battery-powered devices such as smartphones and other mobile platforms.
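The Monte Carlo Dropout idea can be illustrated with a toy linear "reconstructor"; the weight matrix and dropout rate below are placeholders, and in practice the dropout layers of the trained network are simply kept active at inference:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.standard_normal((350, 36))   # toy linear "reconstructor"; weights are placeholders
y = rng.uniform(0, 1, 36)            # one normalized measurement vector

def stochastic_forward(y, W, p_drop=0.5):
    """One forward pass with dropout kept active, as in Monte Carlo Dropout."""
    mask = rng.random(y.shape) >= p_drop      # randomly drop ~half of the inputs
    return W @ (y * mask / (1 - p_drop))      # inverted-dropout rescaling

samples = np.stack([stochastic_forward(y, W) for _ in range(100)])
x_mean = samples.mean(axis=0)        # predictive mean spectrum
x_std = samples.std(axis=0)          # per-wavelength spread = uncertainty estimate
print(x_mean.shape, x_std.max() > 0)
```

A large per-wavelength spread flags inputs on which the model is unsure, which is the signal used to detect out-of-distribution spectra.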

Table 2 Hybrid DL techniques with physics-based priors.

During the training of the DL model, overfitting was carefully considered, and regularization techniques were employed, including a dropout rate of 50% (i.e., randomly dropping out 50% of neurons). In addition to the aforementioned techniques, such as OOD validation and Monte Carlo methods, overfitting can be further mitigated using data augmentation strategies, such as noise injection and synthetic data generation, which expose the model to a broader range of spectral patterns80,81,82. Additionally, L2 regularization can help control model complexity and prevent overfitting83.

Reproducibility and scalability

To validate reproducibility, we fabricated 169 identical MTF filter arrays on a single wafer, as shown in Fig. 2a. Each filter array follows a 6 \(\times\) 6 square grid pattern with precise dimensions (400 \(\times\) 400 \(\upmu\)m2, 300 \(\upmu\)m spacing), yielding a total array size of 4.5 \(\times\) 4.5 mm2. The high degree of uniformity in filter fabrication was achieved through precise control over the deposition process. For mass production, our method utilizes direct current (DC) and radio frequency (RF) magnetron sputtering with precisely controlled deposition parameters. The use of TiO2 (high refractive index) and SiO2 (low refractive index) ensures stable optical properties, with deposition thicknesses carefully regulated via shadow masks. By omitting specific layers in a controlled manner, we achieved 36 unique MTF filters while maintaining high reproducibility across batches. The exact thickness values of each deposited layer are provided in Table 1, demonstrating the controlled variation in layer configurations while maintaining deposition accuracy.

Stencil lithography provides a simplified and scalable alternative to conventional etching-based processes, reducing fabrication complexity and the potential inconsistencies introduced by additional processing steps. Moreover, the lift-off process, which utilizes photoresist and Ge deposition, further enhances the structural consistency of the filter arrays. To assess the feasibility of large-scale production, we analyzed the thickness variations across different filters. Table 1 confirms that the designed layer configurations were accurately achieved, with controlled variations where required (e.g., selective omission of intermediate layers). Future work will explore batch-to-batch variation by fabricating multiple wafers and analyzing spectral deviations. Nevertheless, the results from our current fabrication indicate that the technique is promising for reproducible, high-throughput production of MTF filter arrays with consistent spectral characteristics.

Furthermore, to mitigate reproducibility and scalability issues in wafer-level stencil lithography of multilayer thin-film filters, advanced optical monitoring techniques during the deposition process are crucial. One practical approach is polychromatic optical monitoring, which uses multiple monitoring wavelengths instead of a single one, as in conventional monochromatic monitoring84. This method enables real-time tracking of layer thicknesses and spectral performance, allowing for partial compensation of accumulated thickness errors in previously deposited layers. By comparing real-time transmittance measurements with theoretical models, the monitoring system can dynamically adjust deposition parameters, thereby enhancing uniformity across large batches and improving the precision of spectral reconstruction. Simulation-driven algorithms can further optimize the choice of monitoring wavelengths, tailored to the filter design and material dispersion properties.
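The idea behind polychromatic monitoring can be illustrated with a minimal single-layer sketch. The refractive indices, true thickness, and monitoring wavelengths below are assumed values, and a real monitoring system would fit the full multilayer model and adjust deposition on the fly; the sketch only shows how matching measured transmittance against a model at several wavelengths simultaneously pins down the deposited thickness:

```python
import cmath

def transmittance(n_layer, d_nm, wl_nm, n_in=1.0, n_sub=1.46):
    """Normal-incidence transmittance of a single homogeneous layer on a
    substrate, via the standard 2x2 characteristic-matrix method."""
    delta = 2 * cmath.pi * n_layer * d_nm / wl_nm
    m11, m12 = cmath.cos(delta), 1j * cmath.sin(delta) / n_layer
    m21, m22 = 1j * n_layer * cmath.sin(delta), cmath.cos(delta)
    b = m11 + m12 * n_sub
    c = m21 + m22 * n_sub
    return 4 * n_in * n_sub / abs(n_in * b + c) ** 2

# "Measured" transmittance of a TiO2-like layer (n ~ 2.4, assumed) whose
# true thickness is 120 nm, sampled at several monitoring wavelengths.
wavelengths = [500, 600, 700, 850]
true_d = 120.0
measured = [transmittance(2.4, true_d, wl) for wl in wavelengths]

def fit_thickness(meas, wls, d_grid):
    """Polychromatic fit: pick the thickness whose modeled spectrum best
    matches the measurements across all wavelengths at once."""
    return min(d_grid, key=lambda d: sum(
        (transmittance(2.4, d, wl) - m) ** 2 for wl, m in zip(wls, meas)))

d_grid = [d * 0.5 for d in range(0, 601)]  # 0 to 300 nm in 0.5 nm steps
best_d = fit_thickness(measured, wavelengths, d_grid)
print(best_d)  # 120.0
```

A single monitoring wavelength leaves the thickness ambiguous, because transmittance is periodic in optical thickness; using several wavelengths with incommensurate periods removes that ambiguity, which is the core advantage of the polychromatic approach.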

Critical analysis on extension of the proposed MTF filter

We recognize the significance of UV and IR spectrometers in their various applications and evaluated the applicability of our spectrometer to broader spectral domains. We emphasize, however, that we intentionally designed our MTF filter array for the 500 to 850 nm range, which strikes a practical balance between spectral performance, material feasibility, and economic viability, particularly for compact and low-power applications. We do not advocate extending the present filter design to the UV and IR, as this would necessitate significant modifications at multiple levels. In terms of filter design, UV compatibility often requires materials such as HfO2 and SiO2 layers85; semiconductor materials such as GaN (gallium nitride), InGaAs (indium gallium arsenide), and InP (indium phosphide), which suit cutting-edge semiconductor fabrication processes over the 200 to 700 nm UV-VIS range86; or MgF2 and Al2O3 to ensure adequate transmission from 230 nm in the UV to the start of the VIS region at 400 nm87. IR filters, in turn, rely on substrates such as Ge or ZnSe, which are not only cost-intensive but also present fabrication challenges due to higher absorption losses and thermal sensitivity in multilayer stacks88,89,90. Additionally, achieving uniform and reproducible wafer-scale deposition of these materials further complicates scalability. From a DL perspective, broadening the spectral range introduces new complexities: training models for the UV or IR demands large volumes of representative spectral data, which are often scarce or expensive to acquire.

Note that filter reconfiguration would necessitate recalibration and retraining of the model to match new spectral response profiles, incurring additional computational and labor overhead. Such demands compromise the low-cost, lightweight nature of our current solution. Sensor limitations also play a pivotal role: while our system is optimized for CMOS sensors readily available for the VIS-NIR range, extending to the UV would require GaN- or SiC-based sensors, and IR detection would necessitate InGaAs or HgCdTe sensors, significantly increasing overall system cost and power consumption. In contrast, our current design maintains a 1 nm wavelength resolution (the ability to reconstruct a narrow peak without distortion) within a carefully chosen range of 500 to 850 nm, which suits many real-world applications, such as biomedical imaging, agriculture, and environmental monitoring, where UV or far-IR data are typically non-critical. By focusing on this specific band, we retain advantages in manufacturing simplicity, system compactness, energy efficiency, and cost-effectiveness, making the spectrometer highly suited for battery-powered or mobile deployments. That said, the spectral bandwidth can be further extended to the UV and IR ranges by re-engineering the MTF filter design. Furthermore, our deep learning model relies on the spectral variability of our filter design, so we believe that spectral reconstruction performance can be made more robust by increasing spectral variability through the adoption of nanoarchitectures, such as plasmonic nanoantennas and photonic nanocavities, which is one of our ongoing projects.

Comparison with conventional and modern spectrometers

Conventional spectrometers primarily include grating-based, prism-based, Fabry-Pérot, and Fourier Transform Infrared (FTIR) spectrometers. Grating and prism spectrometers operate on the principles of spatial dispersion and have long been used due to their simplicity and reliability91. Fabry-Pérot interferometers, renowned for their tunability and compact form, are frequently utilized in high-resolution, narrowband applications92,93. FTIR spectrometers, particularly using Michelson interferometers, are standard in infrared spectroscopy due to their broad spectral coverage and high resolution94. While newer approaches, such as computational and metasurface spectrometers, are emerging, conventional types remain dominant in both commercial and laboratory settings95. These spectrometers and their operating principles are well discussed in surveys such as96,97. The trade-offs between conventional and filter-based computational spectrometers primarily revolve around balancing spectral resolution, hardware complexity, and system integration. Conventional spectrometers, such as grating-based or Fourier-transform systems, offer exceptional spectral precision and broad wavelength coverage, making them the gold standard for laboratory-based applications requiring detailed spectral analysis. These include material characterization, remote sensing, and advanced biomedical diagnostics, where high resolution and broad spectral range are critical. Several state-of-the-art computational spectrometers have recently been introduced, each with its advantages and disadvantages. For example, Yang et al. presented a computational spectrometer with the world’s smallest footprint using a single band-graded nanowire22. Yoon et al. reported single-pixel implantable computational spectrometer designs based on a single van der Waals junction, offering considerable advantages in footprint, spectral resolution, and bandwidth19,20,21. 
Still, their approaches require further engineering for commercialization, as the large-scale integration of nanowires and two-dimensional materials is not yet compatible with CMOS image sensors, resulting in high-cost and complex fabrication processes.

In contrast, the proposed spectrometer, based on an MTF filter array with a wavelength resolution of 1 nm, a spectral range of 500 to 850 nm, and a compact footprint of 4.5 \(\times\) 4.5 mm2, is optimized for rapid, single-shot spectral reconstruction using DL. A comparison with conventional, semi-conventional, and modern spectrometers is presented in Table 3. While a direct apples-to-apples comparison is not feasible, the provided comparison sufficiently demonstrates the effectiveness of the proposed spectrometer. The resolution of the proposed spectrometer is adequate for capturing narrow spectral features relevant to applications such as noninvasive health monitoring, food quality inspection, vegetation assessment, and industrial material classification. Although it may not achieve the ultra-high spectral fidelity or broad wavelength coverage of traditional systems, particularly in the UV or IR regions, it offers a practical solution where speed, portability, and cost efficiency are prioritized. Its miniature size enables seamless integration into handheld, wearable, and mobile platforms, facilitating real-time diagnostics and monitoring in the field and point-of-care settings. Furthermore, the simplified optical architecture, combined with scalable thin-film fabrication and data-driven reconstruction, supports high-throughput, low-cost manufacturing, making it well-suited for deployment in resource-constrained environments and consumer electronics.

Table 3 Comparison of proposed computational spectrometer.

Open research challenges

The proposed system demonstrated effective reconstruction over the 500 to 850 nm range with 1 nm spacing. Still, the spectra in our dataset were measured under the assumption that the incident light is normal to the MTF filter array. Because the measured intensities vary with the angle of incidence, the model's performance may degrade under oblique illumination. Furthermore, DL models, including the proposed U-Net architecture, may overfit a small training dataset, reducing robustness when the model encounters spectra with new features or noise characteristics. Several research challenges have been identified that require attention from the research community to enhance the potential of HSI applications.

  1. The DL architecture requires substantial computational power for training, especially with large datasets and high-resolution applications, which restricts accessibility for those without advanced computational hardware.

  2. Achieving uniform layer thicknesses and precise transmission functions across the MTF filter array remains challenging with scalable wafer-level stencil lithography. Larger array sizes increase the likelihood of thickness variability, potentially impacting filter behavior and reconstruction accuracy.

  3. Changes in temperature, lighting, and possible contamination on sensors or filters may influence spectral measurement accuracy; maintaining consistent performance under such real-world conditions may require additional preprocessing or adaptive modeling.

  4. Leveraging the MTF filters’ spatial information for real-time hyperspectral imaging demands additional hardware and careful calibration, making it challenging to maintain speed and resolution for practical use in hyperspectral applications.
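The angle-of-incidence sensitivity noted above can be estimated with the standard effective-index approximation for thin-film filters, in which a passband centered at \(\lambda_0\) at normal incidence blue-shifts as the incidence angle grows. The peak wavelength and effective index below are illustrative assumed values, not parameters of our filters:

```python
import math

def shifted_peak(lambda0_nm, theta_deg, n_eff):
    """First-order estimate of the blue-shift of a thin-film filter passband
    under oblique incidence: lambda(theta) = lambda0 * sqrt(1 - (sin(theta)/n_eff)^2)."""
    s = math.sin(math.radians(theta_deg)) / n_eff
    return lambda0_nm * math.sqrt(1.0 - s * s)

# Illustrative numbers: a 700 nm peak with an assumed effective index of 2.0.
for theta in (0, 15, 30):
    print(theta, round(shifted_peak(700.0, theta, 2.0), 1))
```

A shift of this magnitude (tens of nanometres at steep angles) is large compared with the 1 nm reconstruction grid, which is why angle-dependent calibration or adaptive modeling is listed as an open challenge.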

Conclusion

This study presents a DL-based, single-shot computational spectrometer that combines a 6 \(\times\) 6 MTF filter array with a CMOS camera, achieving a compact and cost-effective design suitable for mobile applications. The MTF filter array, fabricated using wafer-level stencil lithography, enables high-throughput production, with its uniformity confirmed through SEM and FIB analysis, making it an affordable option for on-site detection and simple diagnostic tests. With the ability to reconstruct high-resolution spectra across a broad wavelength range (500–850 nm, 1 nm spacing) in approximately 0.006 seconds per spectrum, the spectrometer demonstrates efficient performance, achieving an average RMSE of 0.0288 for test spectra, including reliable transmission spectrum reconstructions for commercial beverages. The DL architecture, trained on 2576 data pairs, achieves rapid and accurate spectral reconstruction that is typically infeasible with traditional numerical optimization methods, highlighting the advantages of DL for real-time applications. The compact size, minimal optical complexity, and fast measurement capabilities make this spectrometer a promising tool for various applications, such as drink inspection, counterfeit document detection, and self-diagnosis. Moreover, owing to the MTF filters’ capacity for spatial data acquisition, this setup could be expanded into a snapshot hyperspectral imaging system, maximizing its versatility in capturing spectral information efficiently. We believe that the proposed spectrometer is commercially viable and among the most promising solutions for miniaturized computational spectrometry. We look forward to fabrication for commercialization, provided funding becomes available.

Key terms and their definition

  1. Stoichiometry: in chemistry, the determination of the proportions in which elements or compounds react with one another. The rules followed in determining stoichiometric relationships are based on the laws of conservation of mass and energy and the law of combining weights or volumes.

  2. Perturbation: a small disturbance. When external factors affect a system and an exact solution is unavailable, the external influence is treated as a small perturbation to the system in order to approximate its behavior and energy.

  3. Sputtering: a technique for depositing thin films onto a substrate by bombarding a target material with high-energy ions, causing atoms to be ejected and deposited as a thin layer on the substrate.

  4. Shadow mask: a template with specific patterns or openings designed to allow certain materials to pass through while blocking others. It is used in techniques such as physical vapor deposition (PVD) and chemical vapor deposition (CVD) to precisely deposit thin films onto substrates.

  5. Quantum efficiency (QE): the ratio of the number of carriers collected by a detector to the number of photons incident on it.

  6. PECVD: Plasma-Enhanced Chemical Vapor Deposition, a technique for depositing thin films from a gas phase onto a substrate, using plasma to enhance the chemical reactions and enable deposition at lower temperatures.